📙 This is Part 3 in a multi-part series on creating web applications with generative AI integration. Part 1 focused on discussing the AI stack and why the application layer is the best place in the stack to be. Check it out here. Part 2 focused on why Ruby is the best web language for building AI MVPs. Check it out here. I highly recommend you read through both parts before reading this article to get caught up on the terminology used here.
Table of Contents
Introduction
In this article, we will be conducting a fun thought experiment. We seek to answer the question:
How simple can we make a web application with AI integration?
My readers will know that I value simplicity very highly. Simple web apps are easier to understand, faster to build, and more maintainable. Of course, as the app scales, complexity arises out of necessity. But you always want to start simple.
We’ll take a typical case study for a web application with AI integration (RAG), and look at four different implementations. We’re going to start with the most complex setup composed of the most popular tools, and attempt to simplify it step by step, until we end up with the simplest setup possible.
Why are we doing this?
I want to encourage developers to think more simply. Oftentimes, the “mainstream” path to building web apps or integrating AI is far too complex for the use case. Developers take inspiration from companies like Google or Apple, without acknowledging that the tools that work for them are often inappropriate for clients operating at a much smaller scale.
Grab a coffee or tea, and let’s dive in.
Level 1: As Complex As It Gets
Suppose a client has asked you to build a RAG application for them. This application will have one page where users can upload their documents and another page where they can chat with their documents using RAG. Going with the most popular web stack currently in use, you decide to go with the MERN stack (MongoDB, Express.js, React, and Node.js) to build your application.
To build the RAG pipelines that will be handling document parsing, chunking, embedding, retrieval, and more, you again decide to go with the most popular stack: LangChain deployed via FastAPI. The web app will make API calls to the endpoints defined in FastAPI. There will need to be at least two endpoints: one for calling the indexing pipeline and another for calling the query pipeline. In practice, you will also want upsert and delete endpoints, to ensure that the data in your database stays in sync with the embeddings in your vector store.
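In plain Python, the surface of that pipeline service might look like the following sketch. All function names and payload shapes here are hypothetical stand-ins; in the real service each function would be wrapped in a FastAPI route, and the bodies would call LangChain for chunking/embedding and Pinecone for storage:

```python
# Sketch of the four entry points the pipeline service would expose.
# The in-memory dict stands in for the vector store; real chunking,
# embedding, and retrieval are far more involved.

store: dict[str, list[str]] = {}  # doc_id -> chunks

def index_document(doc_id: str, text: str) -> int:
    """POST /index — parse, chunk, embed, and store a document."""
    chunks = [text[i:i + 200] for i in range(0, len(text), 200)]  # naive fixed-size chunking
    store[doc_id] = chunks
    return len(chunks)

def query_documents(question: str, k: int = 3) -> list[str]:
    """POST /query — retrieve the top-k most relevant chunks."""
    words = set(question.lower().split())
    all_chunks = [c for chunks in store.values() for c in chunks]
    # Stand-in for vector similarity: rank by word overlap with the question.
    return sorted(all_chunks, key=lambda c: -len(words & set(c.lower().split())))[:k]

def upsert_document(doc_id: str, text: str) -> int:
    """PUT /documents/{doc_id} — re-index a document that changed."""
    return index_document(doc_id, text)

def delete_document(doc_id: str) -> bool:
    """DELETE /documents/{doc_id} — remove a document's chunks."""
    return store.pop(doc_id, None) is not None
```

The point to notice is that every one of these must be exposed over HTTP, secured, versioned, and kept in sync with the web app's expectations.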
Note that you will be using JavaScript for the web application and Python for the AI integration. This dual-language app means you will likely be using a microservices architecture (see part 2 of this series for more on this). This isn’t a strict requirement, but it is often encouraged in a setup like this.
There is one more choice to be made: what vector database will you be using? The vector database is where you store the document chunks created by the indexing pipeline. Let’s again go with the most popular choice out there: Pinecone. This is a managed cloud vector database that many AI developers are currently using.
The whole system might look something like the following:
Yikes! There are a lot of moving pieces here. Let’s break things down:
- In the bottom rectangle, we have the web application and MongoDB backend. In the middle we have the RAG pipelines built with LangChain and FastAPI. At the top, we have the Pinecone vector database. Each rectangle here represents a different service with its own separate deployment. While the Pinecone cloud vector database is managed for you, the rest is on you.
- I’ve wrapped example HTTP requests and their corresponding responses with a dotted border. Remember, this is a microservices architecture, which means an HTTP request is needed any time inter-service communication occurs. For simplicity, I’ve only illustrated what the query pipeline calls would look like, and I’ve omitted any calls to OpenAI, Anthropic, etc. For clarity, I numbered the requests/responses in the order in which they would occur in a query scenario.
- To illustrate one pain point: ensuring the documents in your MongoDB database stay synced with their corresponding embeddings in the Pinecone index is doable but can be tricky. It takes several HTTP requests to go from your MongoDB database to the cloud vector database. This is a point of complexity and overhead for the developer.
A simple analogy: this is like trying to keep your physical bookshelf synced up with a digital book catalog. Any time you get a new book or donate one from your shelf (turns out you only like the Game of Thrones show, not the books), you have to go and manually update the catalog to reflect the change. In the world of books a small discrepancy won’t really hurt you, but in the world of web applications it can be a big problem.
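A toy illustration of why this sync is fragile, with in-memory dicts standing in for MongoDB and Pinecone (all names are hypothetical): the document write and the embedding upsert are two separate network calls, so a failure between them leaves the two stores inconsistent.

```python
# Two stores, two network hops: if the second call fails, the stores drift apart.

mongo: dict[str, str] = {}              # stand-in for MongoDB documents
pinecone: dict[str, list[float]] = {}   # stand-in for the Pinecone index

def save_document(doc_id: str, text: str, pinecone_up: bool = True) -> None:
    mongo[doc_id] = text                        # request 1: write the document
    if not pinecone_up:
        raise ConnectionError("vector store unreachable")
    pinecone[doc_id] = [float(len(text))]       # request 2: upsert the embedding

def out_of_sync() -> set[str]:
    """IDs present in one store but not the other."""
    return set(mongo) ^ set(pinecone)
```

If the Pinecone call fails after the MongoDB write succeeds, `out_of_sync()` is non-empty, and it is on you to detect and repair the drift.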
Level 2: Drop the Cloud
Can we make this architecture any simpler? Perhaps you recently read an article that discussed how Postgres has an extension called pgvector. This means you can forgo Pinecone and just use Postgres as your vector database. Ideally, you can migrate your data over from MongoDB so that you stick with just one database. Great! You refactor your application to now look like the following:

Now we only have two services to worry about: the web application + database and the RAG pipelines. Once again, any calls to model providers have been omitted.
What have we gained with this simplification? Now your embeddings and the associated documents or chunks can live in the same table in the same database. For example, you can add an embeddings column to a table in PostgreSQL by doing:
ALTER TABLE documents
ADD COLUMN embedding vector(1536);
Maintaining coherence between the documents and embeddings should be much simpler now. Since a document and its embedding live in the same row, you can write both in a single transaction (or use Postgres triggers to enqueue re-embedding whenever a row changes), eliminating the cross-service “write document, then embed” dance entirely.
Returning to the bookshelf analogy, this is like ditching the digital catalog and instead just attaching a label directly to each book. Now, when you move a book around or toss one out, there is no need to update a separate system, since the labels go wherever the books go.
Level 3: Microservices Begone!
You’ve done a good job simplifying things. However, you think you can do even better. Perhaps you can create a monolithic app instead of using the microservices architecture. A monolith simply means that your application and your RAG pipelines are developed and deployed together. An issue arises, however. You coded up the web app in JavaScript using the MERN stack, but the RAG pipelines were built using Python and LangChain deployed via FastAPI. You could try to squeeze these into a single container, using something like Supervisor to oversee the Python and JavaScript processes, but it isn’t a natural fit for polyglot stacks.
So what you decide to do is ditch React/Node and instead use Django, a Python web framework, to develop your app. Now your RAG pipeline code can just live in a utility module inside your Django app. This means no more HTTP requests between services, which removes complexity and latency. Any time you want to run your query or indexing pipelines, all you have to do is make a function call. Spinning up dev environments and deployments is now a breeze. Of course, if you read part 2, our preference is not an all-Python stack, but an all-Ruby stack.
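In framework-agnostic terms (names here are illustrative, not real Django or LangChain APIs), the change looks like this: the view layer imports the pipeline module and calls it directly, replacing the HTTP round-trip to a separate FastAPI service.

```python
# rag.py equivalent — the pipeline is now just a module function.
def query_pipeline(question: str) -> str:
    """Stand-in for the real retrieve-then-generate pipeline."""
    return f"answer to: {question}"

# views.py equivalent — what a Django view would do: one function call, no HTTP hop.
def chat_view(request_body: dict) -> dict:
    answer = query_pipeline(request_body["question"])
    return {"status": 200, "answer": answer}
```

No serialization, no network failure modes, no endpoint versioning: a stack trace from the pipeline lands directly in your app's logs.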
You’ve simplified even further, and now have the following architecture:

An important note: in earlier diagrams, I combined the web application and database into a single service for simplicity. At this point I think it’s important to show that they are, in fact, separate services themselves! This does not mean you’re still using a microservices architecture. As long as the two services are developed and deployed together, this is still a monolith.
Wow! Now you only have a single deployment to spin up and maintain. You can have your database set up as an accessory to your web application. This unfortunately means you’ll still likely want to use Docker Compose to develop and deploy your database and web application services together. But with the pipelines now running as functions instead of as a separate service, you can ditch FastAPI! You no longer need to maintain those endpoints; just use function calls.
A bit of technical detail: in this chart, the legend indicates that the dotted line is not HTTP, but instead the Postgres frontend/backend protocol. These are two different protocols at the application layer of the internet protocol model. (This is a different application layer than the one I discussed in part 1.) Using an HTTP connection to transfer data between the application and the database is theoretically possible, but not optimal. Instead, the creators of Postgres created their own protocol that is lean and tightly coupled to the needs of the database.
Level 4: SQLite Enters the Chat
“Surely we’re done simplifying?”, you may be asking yourself.
Wrong!
There is one more simplification we can make. Instead of using Postgres, we can use SQLite! You see, currently your app and your database are two separate services deployed together. But what if they weren’t two different services, and instead your database was just a file that lives inside your application? That is what SQLite can give you. With the recently released sqlite-vec library, it can even handle RAG, just like pgvector does for Postgres. The caveat here is that sqlite-vec is pre-v1, but that is still fine for an early-stage MVP.

Truly amazing. Now you can ditch Docker Compose! This is genuinely a single-service web application. The LangChain modules and your database are now all just functions and files living in your repository.
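To make this concrete, here is a minimal sketch using only Python's built-in sqlite3 module. I'm not using the actual sqlite-vec API here; plain BLOB columns plus brute-force cosine similarity in Python stand in for its indexed vector search, which is perfectly adequate at MVP scale:

```python
import math
import sqlite3
import struct

# The whole "vector store" is one table in a SQLite database.

def pack(vec: list[float]) -> bytes:
    return struct.pack(f"{len(vec)}f", *vec)

def unpack(blob: bytes) -> list[float]:
    return list(struct.unpack(f"{len(blob) // 4}f", blob))

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

conn = sqlite3.connect(":memory:")  # in production this is a file, e.g. "app.db"
conn.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT, embedding BLOB)")

def add_chunk(text: str, embedding: list[float]) -> None:
    conn.execute("INSERT INTO chunks (text, embedding) VALUES (?, ?)",
                 (text, pack(embedding)))

def top_k(query_embedding: list[float], k: int = 3) -> list[str]:
    rows = conn.execute("SELECT text, embedding FROM chunks").fetchall()
    rows.sort(key=lambda r: -cosine(query_embedding, unpack(r[1])))
    return [r[0] for r in rows[:k]]
```

Document, chunk, and embedding all live in one row of one file; there is nothing to keep in sync and nothing to deploy separately.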
Concerned about using SQLite in a production web application? I wrote recently about how SQLite, once considered just a plaything in the world of web apps, can become production-ready through some tweaks to its configuration. In fact, Ruby on Rails 8 recently made these adjustments the default and is now pushing SQLite as the default database for new applications. Of course, as the app scales, you will likely need to migrate to Postgres or another database, but remember the mantra I mentioned at the beginning: only introduce complexity when absolutely necessary. Don’t assume your app is going to explode with millions of concurrent writes when you are just trying to get your first few users.
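The configuration tweaks in question are a handful of PRAGMAs, similar in spirit to what Rails 8 now ships by default: WAL mode so readers don't block the writer, relaxed fsync, and a busy timeout instead of immediate "database is locked" errors.

```python
import os
import sqlite3
import tempfile

# WAL mode requires a real file, not an in-memory database.
path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path)

conn.execute("PRAGMA journal_mode = WAL")    # readers proceed during writes
conn.execute("PRAGMA synchronous = NORMAL")  # fewer fsyncs; safe under WAL
conn.execute("PRAGMA busy_timeout = 5000")   # wait up to 5s for a lock
```

That's the entire "ops" story for the database layer: three lines run at connection time.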
Summary
In this article, we started with the traditional stacks for building a web application with AI integration. We saw the amount of complexity involved, and decided to simplify piece by piece until we ended up with the Platonic ideal of simple apps.
But don’t let the simplicity fool you; the app is still a beast. In fact, because of the simplicity, it can run much faster than the traditional app. If you notice the app starting to slow down, I would try sizing up the server before considering a migration to a new database or breaking up the monolith.
With such a lean application, you can really move fast. Local development is a dream, and new features can be added at lightning speed. You can still get backups of your SQLite database using something like Litestream. Once your app is showing real signs of strain, then move up the levels of complexity. But I advise against starting a new application at level 1.
I hope you have enjoyed this series on building web applications with AI integration. And I hope I have inspired you to think simple, not complicated!
🔥 If you’d like a custom web application with generative AI integration, visit losangelesaiapps.com