Technical Head · UPES-CSA · Open to Internships

Rudra Gupta

// AI/ML Developer & Full-Stack Engineer

B.Tech CS (AI/ML) student at UPES, Dehradun building LLMs, RAG pipelines, and scalable web systems. Currently leading tech at UPES-CSA. I turn research into real products.

rudra@upes ~ portfolio
whoami
Rudra Gupta // AI/ML Developer
cat skills.txt
Python · PyTorch · LoRA · RAG · FAISS
React.js · Node.js · Docker · AWS
cat current_focus.txt
Edu-SLM // LLaMA 3.1 + LoRA fine-tuning
echo $status
Available for internships
01 — about me

Who I Am

Builder, mentor, and club leader. I ship AI systems that actually work.

Background
Fine-tuning models, shipping code, leading teams.

B.Tech CS (AI/ML) at UPES, Dehradun. I spend my time between PyTorch notebooks and production deploys, with a focus on LLM fine-tuning and RAG systems.

As Technical Head at UPES-CSA, I run the dev team behind upescsa.in, mentor 10+ members, and organize Hackathon 4.0 and AWS Community Day Dehradun 2025.

By The Numbers
5+
Projects Shipped
10+
Members Mentored
4
Events Organized
Education
UPES, Dehradun

B.Tech Computer Science & Engineering with specialization in Artificial Intelligence & Machine Learning.

2023 — 2027 · AI/ML Specialization · Dehradun, India
Current Focus
Edu-SLM

Curriculum-aligned language model using LoRA fine-tuning on LLaMA 3.1 8B with RAG-based retrieval grounding.

LLaMA 3.1 · LoRA · RAG · FAISS
Resume Highlights
01
Top 5 finalist — SIH Internal Hackathon 2025
02
Led AWS Community Day Dehradun 2025 end-to-end
03
Deployed production MERN app on AWS with Docker + Nginx
04
Published semantic QA system over Bhagavad Gita using Llama 3
02 — my journey

The Climb

From curious freshman to technical head — every step shaped how I build and lead.

2023
The Beginning
B.Tech CS (AI/ML) · UPES, Dehradun
Started my journey into computer science with a focus on AI/ML. Discovered a passion for building intelligent systems that solve real problems.
Jun — Jul 2024
First Impact
S.O.S. International (Srijan) · Jammu
Worked with underprivileged communities, learning the value of technology as a force for social good.
Jun 2024
Stepping Up
Associate Technical Head · UPES-CSA
Took on leadership — mentoring 10+ members, contributing to the UPES-CSA platform, and supporting hackathons and workshops.
Apr 2025
Leading The Charge
Technical Head · UPES-CSA
Leading development of upescsa.in, organizing Hackathon 4.0, AWS Community Day Dehradun 2025, and mentoring the next generation.
Jun — Jul 2025
Industry Exposure
Web Developer Intern · Pi Craft · Remote
Built production-grade React.js components, optimized UI performance, and learned to ship code that real users depend on.
2025 — Present
Building The Future
Edu-SLM · Research · LLM Fine-Tuning
My biggest project — a curriculum-aligned language model combining LoRA fine-tuning with RAG retrieval. Turning research into a product.
03 — technical skills

Skills & Stack

From LLM fine-tuning pipelines to production deployments — full-spectrum AI/ML engineering.

Languages (5)
Python · JavaScript · TypeScript · Java · C / C++
AI / ML / LLM (10)
PyTorch · LoRA Fine-Tuning · RAG · FAISS · Hugging Face · Computer Vision · NLP · TensorFlow · Scikit-learn · YOLO
Web Development (8)
React.js · Tailwind CSS · Node.js · REST APIs · Express.js · FastAPI · Streamlit · Next.js
Data & Cloud (8)
Git / GitHub · MongoDB · Docker · AWS · MySQL · PostgreSQL · Nginx · Redis
04 — work & leadership

The Ladder

Each rung represents a leap — from intern to leader, from learning to building.

Web Developer Intern
Pi Craft · Remote
Jun — Jul 2025
  • Built responsive React.js components for production web apps
  • Optimized UI performance across multiple application modules
Associate Technical Head
UPES-CSA
Jun 2024 — Apr 2025
  • Mentored 10+ members in development practices
  • Contributed to the UPES-CSA platform build
  • Supported hackathons, workshops, and competitions
Technical Head
UPES-CSA
Apr 2025 — Present
  • Led development of upescsa.in with AWS deployment
  • Organized Hackathon 4.0, AWS Community Day 2025
  • Leading cross-functional dev and event operations
05 — projects

Featured Projects

End-to-end AI systems, NLP applications, and full-stack platforms shipped to production.

Live

UPESCSA.in — Official Club Website

Official website for UPES-CSA enabling event registrations and dynamic content. Deployed on AWS using Docker and Nginx.

MERN Stack · Docker · Nginx · AWS
Visit upescsa.in →
Live

TheGeetaWay — AI Bhagavad Gita Portal

Semantic search and Q&A over the Bhagavad Gita using vector-based retrieval with Llama 3.

Llama 3 · FAISS · FastAPI · Streamlit
Visit Live App →
Live

ASL Recognition — Real-Time Sign Language

Real-time ASL recognition using YOLO alphabet detection and MediaPipe hand tracking. 0.99+ precision.

YOLO · MediaPipe · OpenCV · Python
View on GitHub →
Live

TempPrediction — Weather Forecasting ML

Temperature prediction model using Random Forest regression. R² score of 0.94 for short-term forecasting.

Random Forest · Scikit-learn · Pandas · Streamlit
View on GitHub →
06 — achievements

Recognition & Events

Competitive wins and flagship events I've led from conception to execution.

🏆
Top 5 — SIH Internal Hackathon 2025
Selected among top 5 teams at the Smart India Hackathon internal round at UPES.
☁️
AWS Community Day Dehradun 2025
Led end-to-end execution of this large-scale AWS community conference as Technical Head.
💻
Hackathon 4.0 — Lead Organizer
Organized and executed UPES-CSA's flagship Hackathon 4.0 from logistics to judging.
Azure Cloudscape & Entropedia 2.0
Delivered two large-scale cloud and tech events, ensuring smooth end-to-end operations.
07 — writing

SEO Articles

Deep-dive technical writing in my niche — LLM engineering and AI for developers.

08 — contact

Get In Touch

Open to AI/ML internships, full-stack roles, research collaborations, and open-source contributions.

Whether you're a recruiter looking for an AI developer, a researcher wanting to collaborate on LLM projects, or someone who wants to chat about building intelligent systems — reach out.


How to Fine-Tune LLMs with LoRA: A Practical Guide for AI/ML Students (2026)

Rudra Gupta · May 1, 2026 · ⏱ 10 min read · 🏷 LLM · LoRA · Fine-Tuning · PEFT
📌 Quick Answer — Featured Snippet

LoRA (Low-Rank Adaptation) lets you fine-tune a pre-trained LLM on your own dataset while training less than 1% of its parameters. By injecting small trainable low-rank matrices into the model's attention layers while keeping the original weights frozen, you can fine-tune an 8-billion-parameter model on a single GPU (like a Google Colab T4) in a matter of hours. In 2025, tools like Unsloth and PEFT make this workflow accessible even to AI students on a limited budget.

When I started studying large language models at UPES, the idea of fine-tuning an LLM seemed overwhelming. Textbooks said you needed large distributed clusters, A100 GPUs, and days of computation. The reality? Parameter-Efficient Fine-Tuning (PEFT) is making LLM development accessible to everyone.

In this tutorial, I'll walk through the math behind LoRA fine-tuning, explain why it works, and build a script that fine-tunes an open-source LLaMA 3 model.

Table of Contents

The Problem: Full Fine-Tuning in 2025

What is LoRA Fine-Tuning? (The Concept)

The Math: Low-Rank Decomposition

The Modern Tooling Stack (Hugging Face)

Dataset Preparation

Hands-On: Fine-Tuning LLaMA 3

LoRA Fine-Tuning FAQs

The Problem: Full Fine-Tuning in 2025

Imagine you have an open-source model like LLaMA 3 (8B parameters). You want to teach it a specific skill using your own dataset, for example translating natural language into SQL queries.

A full fine-tune (FFT) means computing gradients and optimizer states for all 8 billion parameters. The VRAM budget looks roughly like this:

Model weights (bf16/fp16): ~16 GB VRAM
Gradients: ~16 GB VRAM
Optimizer states (AdamW): ~32 GB VRAM
Activations / batch overhead: ~10 GB VRAM

You need at least 80 GB of VRAM. An 80 GB A100 GPU costs about $2-3 per hour to rent. That's a real hurdle for machine learning students.

What is LoRA Fine-Tuning? (The Concept)

LoRA (Low-Rank Adaptation), introduced by Hu et al. at Microsoft Research, takes a smarter path. The insight: because the pre-trained weights already encode general knowledge, the update needed to learn a new skill has a low intrinsic rank.

Instead of modifying the original weights, LoRA does the following:

Freezes all weights of the pre-trained model.

Injects small trainable low-rank decomposition matrices alongside the attention weights (typically the query and value projections).

Trains only these injected matrices.

The result? You train roughly 10 million parameters instead of 8 billion. VRAM requirements drop from 80 GB to about 8 GB, which fits comfortably on a free Colab GPU.
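To make the savings concrete, here is a quick back-of-the-envelope calculation in Python (a sketch: 4096 × 4096 is a typical LLaMA 3 8B projection size, and r = 16 is an assumed rank):

# One projection matrix in LLaMA 3 8B is roughly 4096 x 4096.
d, k, r = 4096, 4096, 16

full_delta = d * k               # parameters in a full update of this layer
lora_delta = (d * r) + (r * k)   # parameters in B (d x r) plus A (r x k)

print(f"Full update: {full_delta:,} params")        # 16,777,216
print(f"LoRA update: {lora_delta:,} params")        # 131,072
print(f"Reduction:   {full_delta // lora_delta}x")  # 128x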

The Math: Low-Rank Decomposition

Consider a standard forward pass:

h = W₀x

where W₀ is the pre-trained weight matrix of size d × k.

In LoRA, we represent the weight update ΔW as the product of two much smaller matrices, B and A:

ΔW = BA

where:

B has size d × r

A has size r × k

r is the rank (e.g. 8, 16, 32), a hyperparameter you choose.

The forward pass becomes:

h = W₀x + BAx

At inference time, you can fold the update back into the weights: W_new = W₀ + BA. This is called "merging the adapter".

The Modern Tooling Stack (Hugging Face)

In 2025, you don't have to write a training loop from scratch. To fine-tune LLMs, ML engineers typically use:

Hugging Face Transformers: the core library for model architectures.

PEFT: the library for configuring and injecting LoRA adapters.

TRL (Transformer Reinforcement Learning): handles SFT (supervised fine-tuning) and preference training.

BitsAndBytes: handles 4-bit and 8-bit quantization (QLoRA) so models fit in VRAM.

Unsloth: an optimized training framework that speeds up LoRA training by up to 2x while using less memory.
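As a quick orientation, here is how those libraries typically appear at the top of a fine-tuning script (a sketch; exact entry points vary by version):

from transformers import AutoModelForCausalLM, AutoTokenizer  # model + tokenizer loading
from transformers import BitsAndBytesConfig                   # 4-bit/8-bit loading (bitsandbytes)
from peft import LoraConfig, get_peft_model                   # LoRA adapter configuration and injection
from trl import SFTTrainer                                    # supervised fine-tuning loop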

Dataset Preparation

Supervised fine-tuning requires your data to be formatted as structured conversations so the model learns turn boundaries, i.e. when to stop talking. The most common format is ChatML:

<|im_start|>user
Write a SQL query to find users over 30 years old.<|im_end|>
<|im_start|>assistant
SELECT * FROM users WHERE age > 30;<|im_end|>
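You rarely type these special tokens by hand; most tokenizers can render a message list through the model's chat template. A sketch using the example above (the checkpoint id is illustrative and assumes a tokenizer that ships a chat template, as instruct variants do):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("unsloth/llama-3-8b-Instruct-bnb-4bit")
messages = [
    {"role": "user", "content": "Write a SQL query to find users over 30 years old."},
    {"role": "assistant", "content": "SELECT * FROM users WHERE age > 30;"},
]
# Render the conversation into a single training string in the model's format.
text = tokenizer.apply_chat_template(messages, tokenize=False)
print(text)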

Hands-On: Fine-Tuning LLaMA 3

Here is a script that shows how simple this API is:

from unsloth import FastLanguageModel
import torch

max_seq_length = 2048

# Load the base model in 4-bit to fit in free-tier VRAM.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = max_seq_length,
    load_in_4bit = True,
)

# Inject the LoRA adapters.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,  # rank of the low-rank matrices
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
)

# ... (configure the trainer with the TRL SFTTrainer, pass the ChatML dataset, and call trainer.train())
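For completeness, here is one way the elided trainer setup can look (a sketch: it assumes a dataset with a "text" column of ChatML strings, a hypothetical train.jsonl file, and a TRL version where SFTTrainer accepts these arguments, since that API has shifted across releases):

from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

dataset = load_dataset("json", data_files="train.jsonl", split="train")  # hypothetical file

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",   # column holding the formatted conversations
    max_seq_length = max_seq_length,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        learning_rate = 2e-4,
        max_steps = 60,
        output_dir = "outputs",
    ),
)
trainer.train()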

LoRA Fine-Tuning FAQs

What is the difference between LoRA and QLoRA?

LoRA trains low-rank adapter matrices while keeping the base model frozen at full precision (16-bit). QLoRA (Quantized LoRA) goes one step further by quantizing the frozen base model to 4-bit precision. QLoRA significantly reduces memory requirements and has become the standard for consumer graphics cards.
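In the Hugging Face stack, the QLoRA part is usually just a quantization config passed at load time. A sketch (the checkpoint id is illustrative):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit = True,
    bnb_4bit_quant_type = "nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype = torch.bfloat16,  # compute in bf16 for stability
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    quantization_config = bnb_config,
    device_map = "auto",
)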

What value should I set for my LoRA rank (r)?

A higher r means more trainable parameters, which can capture more complex tasks but costs more memory and compute. For simple style or tone adaptation, r=8 is often enough. For complex tasks (like code generation or multi-step reasoning), r=32 or r=64 works better.

Can LoRA cause catastrophic forgetting?

Because LoRA keeps the base model frozen, the risk of catastrophic forgetting is much lower than with full fine-tuning. However, when adapters are trained on very narrow data for a long time, they can overfit and produce poor outputs once prompts fall outside that narrow domain.

Have questions about this guide? Contact me on LinkedIn or visit my GitHub profile.


How to Fine-Tune LLMs with LoRA (with Code Examples)

Rudra Gupta · April 1, 2026 · ⏱ 10 min read · 🏷 LoRA · PEFT · LLaMA 3 · Fine-Tuning · LLM Architecture
📌 Quick Answer — Featured Snippet

LoRA at a glance: LoRA (Low-Rank Adaptation) lets you fine-tune a large language model by training only a tiny set of injected weight matrices — as few as 0.05% of total parameters — while the base model stays frozen.

What is LoRA and Why Does It Matter?

Fine-tuning a large language model used to require massive GPU resources. LoRA solves this by enabling parameter-efficient fine-tuning.

How LoRA Works

W_new = W + B × A

Step-by-Step Fine-Tuning

Step 1: Setup

pip install transformers peft accelerate bitsandbytes datasets trl

Step 2: Load Model

from transformers import AutoModelForCausalLM
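Fleshed out slightly, this step might look like the following (a sketch; the checkpoint id is illustrative, and gated models require accepting the license on the Hub):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B"  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)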

Step 3: Apply LoRA

from peft import LoraConfig
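Expanded, the LoRA configuration typically looks like this (a sketch; the target module names match LLaMA-style attention projections):

from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r = 16,
    lora_alpha = 16,
    lora_dropout = 0.05,
    target_modules = ["q_proj", "v_proj"],  # attention query/value projections
    task_type = "CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # prints the tiny trainable fraction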

Step 4: Train

trainer.train()
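After training, the adapter can be saved on its own (a few megabytes) or merged into the base weights for deployment. A sketch (the output paths are illustrative):

model.save_pretrained("llama3-lora-adapter")  # saves only the small adapter weights
merged = model.merge_and_unload()             # folds W0 + BA back into the base model
merged.save_pretrained("llama3-merged")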

Hyperparameters

Parameter       Typical Range
r               8 – 64
learning_rate   1e-4 – 3e-4

Combining LoRA with RAG

Combine LoRA with RAG to achieve both stylistic control and factual grounding.

FAQ

What rank should I use? Start with r=8 or r=16.

How much GPU memory? 12–16 GB with QLoRA.