Harnessing the power of Stable Diffusion.

October 27, 2023

It was March 2023, my 8th and final semester. I didn't want to build something old school, like the so-and-so management system that every student builds and that previous batches had already built many times over. I wanted to be cool.

The AI boom 🤖

Months prior, OpenAI had released their GPT models, and the software industry was taken by storm. The initial (now legacy) LLMs released by OpenAI were impressive and became popular very quickly. text-davinci-002, text-davinci-003, davinci, curie, babbage and ada were some of the earliest models offered by OpenAI.

I made an account with OpenAI and, with the $17 free credit, tried all the listed LLMs. They felt like magic. The interaction between you and the model felt human. It was unbelievable. I cloned their starter projects and made my first LLM project, Reddit Username Generator. Don't judge me. I was amazed to see it respond to my natural language instructions. The interaction never felt computerized. A human touch was there. Subtle, but it was there.

The Picasso Models 🪄

After the boom of AI models, many companies and organisations released models of their own. One of them was Stable Diffusion, which is a text-to-image model rather than an LLM. It was released around the same time as the chat models. To me it was more magical than the chat models: how in the world can a model generate or paint images from scratch, given nothing but an English prompt? It was astonishing, something I had never seen before.

The Ideation 💡

I was well aware of the LLMs. I wanted to try all of them. Thanks to Hugging Face for giving us free APIs to play with.

Minertia is a college project. Since these models were very popular at the time, the basic idea of using Stable Diffusion to generate images already existed. We were aware of that and decided to consult our mentor. After the discussion, our mentor said we needed to make it unique, otherwise the project could be flagged as plagiarized.

I brainstormed some ideas to make it unique. I also wanted to try blockchain and web3 stuff, and I saw an opportunity to do both. So I added a feature for minting the generated images as NFTs on OpenSea. It worked great, but with one limitation: only 100 NFTs can be minted for free with NFTPort's APIs.

So the minting feature will be there until we hit 100 NFTs. After that, the feature will be deprecated, and most probably I will archive this project as a memento.

Tech Stack 💻

The first iteration of this project was built with pure React. NFTPort was used to mint the NFTs and get them onto OpenSea. NFT.Storage was used to upload the generated images to IPFS so that NFTPort could access them.
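Here is a rough sketch of how that flow fits together: generate an image, upload it to IPFS, then mint it. This is not the actual Minertia code. The three step functions, their signatures and the stub values are all hypothetical stand-ins for the Hugging Face, NFT.Storage and NFTPort calls, whose real APIs are not shown here.

```typescript
// Hypothetical step signatures. In the real project these would wrap
// Hugging Face, NFT.Storage and NFTPort; their actual APIs are not shown.
type GenerateImage = (prompt: string) => Promise<Uint8Array>;
type UploadToIpfs = (image: Uint8Array) => Promise<string>; // resolves to an IPFS URL
type MintNft = (ipfsUrl: string, title: string) => Promise<string>; // resolves to a transaction id

interface MintPipelineSteps {
  generateImage: GenerateImage;
  uploadToIpfs: UploadToIpfs;
  mintNft: MintNft;
}

interface MintPipelineResult {
  ipfsUrl: string;
  transactionId: string;
}

// Runs the three steps in order, since each one needs the previous output.
async function generateAndMint(
  steps: MintPipelineSteps,
  prompt: string
): Promise<MintPipelineResult> {
  if (prompt.trim() === "") {
    throw new Error("Prompt must not be empty");
  }
  const image = await steps.generateImage(prompt);
  const ipfsUrl = await steps.uploadToIpfs(image);
  const transactionId = await steps.mintNft(ipfsUrl, prompt);
  return { ipfsUrl, transactionId };
}

// Stub steps so the sketch runs without any network access.
const stubSteps: MintPipelineSteps = {
  generateImage: async (prompt) => new Uint8Array(prompt.length),
  uploadToIpfs: async (image) => `ipfs://stub-${image.length}`,
  mintNft: async (ipfsUrl) => `tx-for-${ipfsUrl}`,
};
```

Passing the steps in as functions keeps the ordering logic separate from the services, so the same flow can be tried with the stubs first and wired to real calls later.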

For the second iteration, I rebuilt it with Next.js. The whole UI was revamped with shadcn/ui and Tailwind CSS, two of the most iconic libraries in modern web development.

The Drawback 💀

The project uses NFTPort, which gives a user only 100 free NFT mints on the blockchain. Because of this, once our app hits 100 minted tokens, the minting feature will stop working 🥲.
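One way to handle this cap gracefully is to track the mint count and switch the mint button off before the limit is hit. The sketch below is only an illustration. `MintQuota` and `FREE_MINT_LIMIT` are hypothetical names, and the counter lives in memory here, while a real app would need to store the count somewhere persistent.

```typescript
// The 100-mint cap comes from the post; the counter is a hypothetical helper.
const FREE_MINT_LIMIT = 100;

class MintQuota {
  private minted = 0;
  private readonly limit: number;

  constructor(limit: number = FREE_MINT_LIMIT) {
    this.limit = limit;
  }

  // Number of mints still available before the cap is reached.
  remaining(): number {
    return this.limit - this.minted;
  }

  // True while at least one more mint is allowed.
  canMint(): boolean {
    return this.minted < this.limit;
  }

  // Records one successful mint; throws once the cap has been reached.
  recordMint(): void {
    if (!this.canMint()) {
      throw new Error(`Mint limit of ${this.limit} reached`);
    }
    this.minted += 1;
  }
}
```

The UI could call `canMint()` to decide whether to show the mint button, and `recordMint()` after every successful mint.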

Moral of the story 💁

Anyways, I had a lot of fun exploring Stable Diffusion. I will probably make another thing with it in the future. See you in the next blog.