Original Article
Agentic AI Based Smart Assistant: A Multimodal Visual Question Answering System Using Fast API and GROQ Vision-Language Models
Gaurav Arya1
Shuchi Sharma2
1 Student, Department of AIML, ADGIPS, FC-26 Shastri Park, Shahdara, New Delhi, India.
2 Assistant Professor, Department of AIML, ADGIPS, FC-26 Shastri Park, Shahdara, New Delhi, India.
Published Online: May-August 2026
Pages: 237-240
Cite this article: https://www.doi.org/10.59256/indjcst.20260502026
References
1. S. Antol et al., "VQA: Visual Question Answering," in Proc. ICCV, 2015, pp. 2425–2433.
2. P. Anderson et al., "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering," in Proc. CVPR, 2018, pp. 6077–6086.
3. A. Radford et al., "Learning Transferable Visual Models from Natural Language Supervision," in Proc. ICML, 2021, pp. 8748–8763.
4. J. Li et al., "BLIP-2: Bootstrapping Language-Image Pre-training," in Proc. ICML, 2023.
5. H. Liu et al., "Visual Instruction Tuning," in Proc. NeurIPS, 2023.
6. P. Lewis et al., "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks," in Proc. NeurIPS, 2020.
7. N. Reimers and I. Gurevych, "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks," in Proc. EMNLP, 2019.
8. S. Ramírez, "FastAPI Documentation." [Online]. Available: https://fastapi.tiangolo.com/ (Accessed 2025).
9. Groq Inc., "Groq API Documentation." [Online]. Available: https://console.groq.com/docs (Accessed 2025).
10. D. Jurafsky and J. H. Martin, Speech and Language Processing, 3rd ed. Stanford University, 2021.