Open to New Grad roles: AI/ML · SWE · Data

Hi, I'm Yogeshvar.
I build things that ship.

// computer scientist · ml researcher · systems tinkerer

M.S. Computer Science at Penn State. I work across AI/ML research, low-level systems in C, and end-to-end data pipelines, turning ideas into working software, consistently.

// about

About

A short version, because the work should speak louder.

I'm a Master's student in Computer Science at The Pennsylvania State University, working at the intersection of Artificial Intelligence, Machine Learning, NLP, and software engineering. I care about backend systems, scalable ML, and shipping things that actually run in production, not just notebooks.

I'm looking for New Grad roles in AI/ML, Data Science, or Software Engineering where I can apply both research depth and engineering discipline. If you're hiring, or just want to nerd out about transformers and signal handlers, let's talk.

// experience

Experience

Academic and professional journey.
Academic
  • M.S. Computer Science & Engineering

    The Pennsylvania State University · State College, PA

    Aug 2024 – May 2026 · GPA 3.68/4.0
    • Coursework: Advanced Machine Learning, Deep Reinforcement Learning, Computer Vision, NLP, Advanced Operating Systems.
    • Active research on transformer-based time-series modeling, retrieval-augmented generation, and multi-agent AI systems.
  • B.Tech Computer Science & Engineering (AI/ML)

    SRM University AP · Amaravati, India

    Aug 2020 – May 2024 · GPA 3.65/4.0
    • Coursework: Artificial Intelligence, Data Structures & Algorithms, Statistical Analysis, Database Systems, Social Network Analysis, Soft Computing.
    • Four peer-reviewed publications (Springer, IEEE) across forecasting, sentiment analysis, and information retrieval.
    • Gold Medal, AI/ML category, 5th Research Day; Best Entrepreneur, AMEYA 2K23.
Professional
  • Graduate Teaching & Research Assistant

    Penn State University · State College, PA

    Aug 2024 – Present
    • Trained a hybrid BERT-DPRCNN text classifier (Transformer + CNN fusion) that hit 94.5% on Chinese and 91.9% on English benchmarks. Published at CAINE 2025, Springer CCIS.
    • Shipped a RAG chatbot for 50+ users: recursive PDF chunking, sentence-transformer embeddings, FAISS ANN retrieval, deployed on Gradio. Lifted answer accuracy 35% via prompt and role-based tuning.
    • Built an ML validation harness (9 modular Python components, Streamlit review UI) with a confusion-matrix benchmarking layer that surfaced systematic error patterns across 8 classes.
    • Designed a multi-agent system on LangChain + Azure OpenAI; prototyped MCP server and A2A patterns for horizontally scalable agent backends.
    LangChain · Azure OpenAI · FAISS · PyTorch · Hugging Face · Gradio · Streamlit · MCP / A2A
  • Research Intern

    SRM University AP · Amaravati, India

    Aug 2023 – May 2024
    • Built a hybrid ARIMA-LSTM forecaster with a moving-average integration layer; beat ARIMA-only and LSTM-only baselines on univariate benchmarks. Published in SN Computer Science, Springer (2024).
    • Lifted sentiment classification accuracy from 76.86% → 83.52% across a 100K+ review corpus by tuning CNN, LSTM, and SVM architectures. Published at IEEE CICN 2023.
    • Co-authored two Springer chapters on information retrieval: asymmetric KL-divergence query expansion, and a Modified Cross-Encoder for two-stage passage ranking (nDCG@10 = 0.77).
    PyTorch · TensorFlow · ARIMA · LSTM · BERT · scikit-learn · Statsmodels
  • Research Intern

    Deakin University (Remote, advised by Prof. Gang Li) · Australia

    Oct 2023 – Dec 2023
    • Trained LSTM and Transformer baselines for Tourism Demand Forecasting over multi-year visitation time-series.
    • Built CNN-based Animal Face Detection for endangered species ID, supporting wildlife conservation efforts.
    • Explored Spectral-Spatial Fusion CNNs on hyperspectral imagery for Land Cover Classification.
    PyTorch · CNNs · Transformers · Hyperspectral · GIS
  • Software Engineering Intern

    Insignia Consultancy Solutions (Remote) · Texas, USA

    Aug 2023 – Oct 2023
    • Cut LLM fine-tuning data prep time 68% (hours → minutes) with parallelized Python + TensorFlow ETL, reaching 3.2× the throughput of the sequential workflow.
    • Replaced a manual content curation step with a TF-IDF + GloVe keyword extraction service that indexed 500+ articles/month into MongoDB.
    • Shipped a Django + PostgreSQL health-topic analytics platform; personalized feeds drove a 20% engagement lift in the first month.
    • Integrated ChatGPT API with BeautifulSoup-scraped sources to lift monthly article output 50% with no added headcount.
    Python · TensorFlow · Django · PostgreSQL · MongoDB · ChatGPT API
  • Software Development Engineer Intern

    CHEARS Organization · Madhya Pradesh, India

    May 2022 – Aug 2022
    • Designed a normalized MySQL schema with indexed foreign-key relationships supporting 40,000+ patient records at sub-second read latency under concurrent load.
    • Built and deployed RESTful APIs over HTTP/TCP consumed by React-based admin dashboards; owned the system from schema design through cloud production deployment.
    • Implemented role-based authentication via cloud identity services; cut evaluation workflow time 30% by replacing manual entry with automated ingestion forms.
    Java · Node.js · MySQL · React · REST APIs · Cloud Auth
  • Web Development Intern

    Oasis Infobyte (Remote) · India

    May 2023 – Jun 2023
    • Built interactive web applications across the full stack (responsive layouts, API integrations, accessibility-aware components), following team coding standards and the code-review process.
    HTML / CSS · JavaScript · REST APIs
  • Salesforce Developer Intern

    TheSmartBridge · India

    Apr 2022 – Jun 2022
    • Built Lightning Web Components and REST API integrations enabling dynamic UI updates and real-time data sync across enterprise platforms.
    • Wrote Apex triggers and batch processors; configured Salesforce Org Hierarchy, security, and access control for scalable enterprise workflows.
    • Automated CRM business processes with Flows, Workflows, and Approval Processes.
    Apex · Lightning Web Components · Salesforce CLI · REST APIs
// stack

Skills

The tools I reach for, grouped by what I use them for.
AI / ML
PyTorch · TensorFlow · Transformers · Computer Vision · NLP · Generative AI · Self-Supervised · scikit-learn · XGBoost
Languages
Python · C · C++ · SQL · Java · JavaScript
Systems / Infra
Linux · Pthreads · GDB · POSIX · GPU / CUDA · Distributed · Git
Backend
FastAPI · Flask · Node.js · Express · Spring · Firebase
Data
pandas · NumPy · Matplotlib · Statistics · Feature Eng.
Frontend
HTML · CSS · React · React Native
// projects

Projects

25 public repos across systems, deep RL, ML, NLP, CV, and data.

Virtual Memory Manager

C · Linux · POSIX signals · GDB

Userspace VMM simulating OS demand paging on limited frames. Intercepts SIGSEGV via sigaction, extracts faulting addresses from ucontext_t, and toggles page permissions with mprotect(). FIFO and Third-Chance page replacement.
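
For illustration only, here is a minimal Python sketch of the FIFO policy on a fixed number of frames. The project itself is written in C and driven by SIGSEGV handling; the function name fifo_page_faults and the reference-string interface are assumptions made for this sketch, not code from the repo.

```python
from collections import deque


def fifo_page_faults(references, num_frames):
    """Count page faults for a page-reference sequence under FIFO replacement."""
    resident = set()   # pages currently held in frames
    order = deque()    # arrival order, oldest page on the left
    faults = 0
    for page in references:
        if page in resident:
            continue   # hit: FIFO does not reorder on access
        faults += 1
        if len(resident) == num_frames:
            victim = order.popleft()   # evict the page that arrived first
            resident.remove(victim)
        resident.add(page)
        order.append(page)
    return faults


refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_page_faults(refs, 3), fifo_page_faults(refs, 4))
```

On this classic reference string, adding a fourth frame produces more faults than three frames, which is the FIFO anomaly that motivates second-chance style policies.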

Multithreaded OS Scheduler

C · Pthreads · Linux · GDB

Uniprocessor scheduler with FCFS, preemptive SRTF, and 5-level MLFQ (5–25ms quanta). Thread-safe semaphores from scratch via pthread_mutex and condition variables: fully passive-wait, zero spin-locks.
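
The mutex-plus-condition-variable pattern can be sketched in Python, with threading.Condition standing in for the pthread_mutex and pthread_cond pair. This is an analogy to the C project, not its code; the class name CountingSemaphore and its wait/post methods are hypothetical.

```python
import threading


class CountingSemaphore:
    """Counting semaphore built from one lock and one condition variable."""

    def __init__(self, initial=0):
        if initial < 0:
            raise ValueError("initial count must be non-negative")
        self._count = initial
        self._cond = threading.Condition(threading.Lock())

    def wait(self):
        """Block (sleeping, not spinning) until the count is positive, then take one."""
        with self._cond:
            while self._count == 0:   # loop guards against spurious wakeups
                self._cond.wait()     # releases the lock while asleep
            self._count -= 1

    def post(self):
        """Release one unit and wake a single waiter, if any."""
        with self._cond:
            self._count += 1
            self._cond.notify()
```

The while loop around the wait is the part that matters: a woken thread re-checks the count under the lock before proceeding.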

Custom Heap Allocator

C · Linux · Memory Mgmt

Buddy + Slab allocators on a custom heap: power-of-two partitioning, recursive merge, kernel-style Slab Descriptor Tables. my_malloc / my_free with embedded metadata and user-aligned pointers.
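
Two pieces of arithmetic sit at the core of any buddy scheme: rounding a request up to a power of two, and locating a block's buddy for merging. A small Python sketch follows, assuming block offsets are measured from the heap base and are multiples of the block size; the helper names are hypothetical and not taken from the C allocator.

```python
def round_up_pow2(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    if n < 1:
        raise ValueError("size must be at least 1")
    return 1 << (n - 1).bit_length()


def buddy_offset(offset, block_size):
    """Offset of the buddy of the block at `offset`.

    Assumes `block_size` is a power of two and `offset` is a multiple of it,
    so the two buddies differ only in the bit equal to `block_size`.
    """
    return offset ^ block_size


print(round_up_pow2(100))     # request of 100 bytes lands in a 128-byte block
print(buddy_offset(128, 64))  # the 64-byte block at 128 merges with the one at 192
```

When a freed block and its buddy are both free, they merge into one block of twice the size, and the same check repeats one level up.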

IoT Sensor Data Analysis

Python · pandas · scikit-learn · IoT

End-to-end IoT sensor data processing and ML pipeline: ingestion, cleaning, feature extraction, and predictive modeling on time-series data across multiple iterative development stages.

Deep RL Homeworks: IST 597

Python · PyTorch · Gymnasium · Stable-Baselines3

Full IST 597 Deep RL course: MDPs, Bellman, REINFORCE, PPO, Actor-Critic, VDN multi-agent, DQN+ICM exploration, offline behavioral cloning. Unified runnable notebook covering all 3 homeworks and the final project.

Deep RL News Recommendation

Python · PyTorch · DQN · MIND dataset

Personalized news recommendation using Deep RL on MINDsmall. Compared MC, Q-Learning, SARSA, DQN, and DQN+ICM; curiosity-driven exploration significantly improved cold-start recommendation performance.

Masked Time-Series Transformers

Python · PyTorch · BERT · Optuna

6-layer Transformer encoder pre-trained on 3,883h of unlabeled IMU data with 15% masked patches, fine-tuned for forecasting + HAR. 25.5% MAE reduction, 72.6% accuracy across 6 activities, F1 0.91 on Sleep.
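
The 15% patch-masking step can be sketched with the standard library alone. This is only an illustration of choosing which patches to hide during pre-training; the function name mask_patches, the uniform random selection, and the seed parameter are assumptions, and the actual project presumably does this on PyTorch tensors.

```python
import random


def mask_patches(num_patches, mask_ratio=0.15, seed=None):
    """Return a list of booleans, True where a patch is masked.

    Picks round(num_patches * mask_ratio) patches uniformly at random,
    with at least one patch always masked.
    """
    rng = random.Random(seed)
    num_masked = max(1, round(num_patches * mask_ratio))
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


mask = mask_patches(100, mask_ratio=0.15, seed=0)
print(sum(mask), "of", len(mask), "patches masked")
```

During pre-training the encoder sees the unmasked patches and is trained to reconstruct the masked ones, so no labels are needed.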

Cardiovascular Disease Classification

Python · scikit-learn · XGBoost · pandas

Binary and multi-class classification of cardiovascular disease severity, replicating an ML fusion approach. Compared Logistic Regression, Random Forest, Gradient Boosting, SVM, and ensemble fusion.

Child Welfare ML: Telangana

Python · pandas · scikit-learn · matplotlib

ML analysis of child welfare indicators across 33 Telangana districts using government ICDS data (2020–2021). Predicted wasting, stunting, and underweight from climate, infrastructure, and agricultural features.

News Category Prediction

Python · scikit-learn · TF-IDF · NLP

Text classification pipeline for news article category prediction: TF-IDF and embedding features; Logistic Regression, Naive Bayes, and deep classifier comparison with full evaluation metrics.

Fish Life Span Prediction

Python · pandas · scikit-learn · matplotlib

Regression ML predicting fish species life span from biological and ecological features (size, habitat, diet). Random Forest and Gradient Boosting with RMSE/MAE/R² evaluation.

Q-Learning Tic-Tac-Toe

Python · Q-Learning · RL · NumPy

Q-learning agent trained to play Tic-Tac-Toe via epsilon-greedy self-play. Q-table convergence analysis, win-rate evaluation against random and greedy opponents.
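
A minimal sketch of the two building blocks, epsilon-greedy action selection and the tabular Q-learning update, using only the standard library. The function names, the (state, action) dictionary keys, and the default alpha and gamma values are assumptions for illustration; a self-play setup would also need to handle the alternating players, which this sketch leaves out.

```python
import random
from collections import defaultdict


def choose_action(q_table, state, legal_actions, epsilon, rng=random):
    """Explore with probability epsilon, otherwise pick the best-known action."""
    if rng.random() < epsilon:
        return rng.choice(legal_actions)
    return max(legal_actions, key=lambda a: q_table[(state, a)])


def q_update(q_table, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward r + gamma * max Q(s', a')."""
    # Terminal states have no next actions, so the bootstrap term is 0.
    best_next = max((q_table[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (target - q_table[(state, action)])


q = defaultdict(float)                      # unseen (state, action) pairs start at 0
q_update(q, "s0", 4, 1.0, "terminal", [], alpha=0.5)
print(choose_action(q, "s0", [0, 4, 8], epsilon=0.0))
```

With epsilon set to 0 the agent is fully greedy, which is the setting typically used when measuring win rate after training.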

Skin Cancer Detection via Image Segmentation

Python · SVM · CNN · HOG

One-Class SVM + CNN on 1K+ dermoscopy images. HOG feature extraction cut false positives 30%; tuned classifiers improved early-detection 20% on 500+ test images.

Smoking Detection: Multi-Model CNN

Python · TensorFlow · MobileNetV2 · EfficientNetB3 · River

Three-model smoking detection: MobileNetV2 transfer learning baseline, EfficientNetB3 fine-tuned for accuracy, and River online learning for incremental streaming inference, with no full retraining.

Faulty Science Question Classifier

Python · BERT · RoBERTa · NLP

Fine-tuned BERT and RoBERTa to detect logically flawed science questions. Up to 91% accuracy with error analysis on ambiguity and subtle logical failures across disciplines.

LLM Output Classifier

Python · BERT · Transformers · NLP

BERT classifier identifying which LLM generated a given text completion. Trained on multi-model completions with feature analysis on stylistic fingerprints across GPT, Claude, and open-source models.

SEARLE: Zero-Shot Composed Image Retrieval

Python · CLIP · Transformers · Textual Inversion

Replication of Zero-Shot Composed Image Retrieval with Textual Inversion. Combines image + text modification query in CLIP embedding space via pseudo-word textual inversion, with no training pairs needed.

ModiCross: Two-Stage Document Ranking

Python · Transformers · CrossEncoder

Modified CrossEncoder with novel scoring (logits + cosine similarity on [CLS]) on 96K query-doc pairs. Hard negative mining + combined loss → +0.26 nDCG@10 and 50%+ ranking precision lift.
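
For readers unfamiliar with the metric, nDCG@10 can be computed as below. This sketch uses the linear-gain form of DCG and takes the ideal ordering from the same relevance list, both of which are assumptions; the paper's evaluation may use a different gain function or tooling.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels of the returned passages, in ranked order.
print(ndcg_at_k([0, 1, 0, 0]))
```

A score of 1.0 means the ranker placed the passages in the best possible order; pushing a relevant passage from rank 1 to rank 2 already drops the score to about 0.63.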

Friends Chatbot

Python · Transformers · Gradio · PyTorch

Character chatbot fine-tuned on Friends TV show dialogue. Select Ross, Rachel, Monica, Chandler, or Joey and chat in their style. Gradio UI with character portraits and themed background.

Canvas RAG Chatbot

Python · FAISS · Gradio · RAG

RAG chatbot serving 50+ students: recursive PDF chunking, sentence-transformer embeddings, FAISS ANN retrieval. 35% answer accuracy lift via prompt and role-based tuning.
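
One common way to do recursive chunking is to split on the coarsest separator first and fall back to finer ones only for pieces that are still too long. The sketch below shows that idea with the standard library; the function name, the separator order, and the character-based size limit are assumptions, and the chatbot may well rely on a library splitter instead.

```python
def recursive_chunks(text, max_chars=500, separators=("\n\n", "\n", ". ", " ")):
    """Split text into chunks of at most max_chars, preferring coarse boundaries."""
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    for i, sep in enumerate(separators):
        if sep not in text:
            continue
        chunks, current = [], ""
        for piece in text.split(sep):
            candidate = piece if not current else current + sep + piece
            if len(candidate) <= max_chars:
                current = candidate          # keep packing pieces together
                continue
            if current:
                chunks.append(current)
            if len(piece) <= max_chars:
                current = piece
            else:
                # Piece is still too long: recurse with the finer separators.
                chunks.extend(recursive_chunks(piece, max_chars, separators[i + 1:]))
                current = ""
        if current:
            chunks.append(current)
        return chunks
    # No separator applies: fall back to a hard character split.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


print(recursive_chunks("aaaa bbbb cccc dddd", max_chars=9))
```

Each chunk would then be embedded and added to the FAISS index; keeping paragraph and sentence boundaries intact tends to give cleaner retrieval hits than fixed-width slicing.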

Flight & Weather Delay Analysis

Python · scikit-learn · pandas · Random Forest

Predicting US flight delays by fusing 725K flight records with NOAA hourly weather. Cyclical sin/cos time encoding, traffic proxies, Random Forest model. 3.2× delay multiplier simulated in severe winter weather.
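
Cyclical sin/cos encoding maps a periodic value such as hour of day onto the unit circle, so that 23:00 and 00:00 end up close together instead of 23 units apart. A minimal sketch, with the function name cyclical_encode chosen for illustration:

```python
import math


def cyclical_encode(value, period):
    """Map a periodic value (e.g. hour with period 24) to (sin, cos) features."""
    angle = 2 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def feature_distance(a, b, period):
    """Euclidean distance between two values in the encoded space."""
    (sa, ca), (sb, cb) = cyclical_encode(a, period), cyclical_encode(b, period)
    return math.hypot(sa - sb, ca - cb)


# 23:00 is closer to 00:00 than 12:00 is, once encoded.
print(feature_distance(23, 0, 24) < feature_distance(12, 0, 24))
```

The same transform applies to day of week (period 7) or month (period 12); both the sine and cosine columns are needed, since either one alone maps two different times to the same value.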

Gun Deaths Analysis & Clustering

Python · pandas · scikit-learn · seaborn

Statistical analysis of US gun death data with agglomerative hierarchical clustering. Trends by year, age, race, intent, and place, with dendrogram and demographic group interpretation.

CHEARS Internship Data Analysis

Python · pandas · scikit-learn · MySQL

Data science work from CHEARS internship: EDA, feature engineering, and ML modeling on healthcare data supporting 40,000+ patient records with sub-second read latency under concurrent load.

Travel Management System

Java · Node.js · MongoDB · HTML/CSS

Full-stack travel platform with admin dashboard, secure payments, and AI-driven trip recommendations. Real-time itinerary management, booking history, and customer inquiry support.

Tourism Web Design Project

PHP · HTML · CSS · JavaScript

PHP-based tourism website with destination listings, travel package browsing, and hotel/tour booking form. Responsive design with image sliders and dynamic package pages.

// research

Publications

Peer-reviewed work across forecasting, sentiment analysis, and deep learning.
  • Bert_DPRCNN: A Novel Approach to Text Classification Using Hybrid Deep Learning Models

    CAINE 2025 · Springer CCIS vol. 2740 · 2026

    Hybrid transformer + CNN + RNN + LSTM model for text classification. 94.54% accuracy on Chinese benchmarks, 91.89% on English.
  • Improving Information Retrieval with Asymmetric Divergence from Randomness in Query Expansion

    ICTIS 2025 · Springer LNNS vol. 1518 · 2026

    Tackles vocabulary mismatch in IR using Asymmetric KL Divergence for query expansion. Evaluated against Vanilla-BERT and Kernel Neural Ranking Model on large real-world datasets.
  • Modified Cross Encoder for Two-Stage Passage Ranking

    ICTIS 2025 · Springer LNNS vol. 1518 · 2026

    Two-stage cross-encoder reranker with a novel scoring head (logits + cosine similarity on [CLS] embeddings). nDCG@10 of 0.77, beating BM25, SparseBiEncoder, DenseBiEncoder, and the base cross-encoder.
  • Enhancing Forecasting Accuracy with a Moving Average-Integrated Hybrid ARIMA-LSTM Model

    SN Computer Science · Springer · 2024

    Hybrid ARIMA + LSTM with a moving-average integration step. Improved time-series forecasting accuracy over either model alone.
  • Advancements in Sentiment Analysis: A Deep Learning Approach

    IEEE Conference · 2024

    Deep learning techniques applied to sentiment classification across multiple datasets, comparing architectures and training regimes.
  • Comparative Study on Sentiment Analysis Using Machine Learning Techniques

    Mehran University Research Journal of Engineering & Technology

    Comparative analysis of classical ML models for sentiment classification: accuracy, recall, and runtime trade-offs across model families.
// achievements

Recognition

Awards and selective programs.

Gold Medal · AI/ML, 5th Research Day

SRM University AP. Awarded for a research presentation in the AI/ML category at the university's annual Research Day.

Hatch Lab Member · SRMAP E-Cell

Collaborated with an early-stage startup incubator: business modeling, problem-solving workshops, and market validation.

Best Entrepreneur Award · AMEYA 2K23

Paari School of Business, SRM University AP. Recognized for a scalable, problem-solving business model in a competition across institutions.

// community

Beyond the code

Leadership and community work alongside the technical track.
  • Co-Founder · Volta. Tech-driven initiative encouraging students to innovate and ship: hackathons, ideation workshops, networking sessions.
  • Team Lead · E-Cell, SRM University AP. Ran entrepreneurship and product-building workshops reaching 3,000+ students; led pitch sessions, industry visits, and mentorship.
  • Member · Elon Fellowship, SRM University AP. Translated academic research into viable startups; identified projects with commercial potential.
  • IR Council Member · International Relations, SRM University AP. Workshops, webinars, and cultural exchange to expand global student outreach.
  • Core Member · IGSA, Penn State. Planning and hosting cultural and networking events for Indian graduate students; supporting incoming-student transition and cross-cultural dialogue.
// contact

Let's talk

Fastest way to reach me is email. I respond within a day.
Location
State College, PA
Office hours · Thu 5PM
Yogeshvar Reddy Kallam graduation

PSU Class of 2026 🎓🎉

WE ARE — PENN STATE!