TL;DR: I wrote a PDF store app with Gemini. Try it out here: https://pdf-reader-app-latest.onrender.com/, or scroll down to see screenshots.

Motivation

I had been searching for an app that lets me aggregate PDFs and annotate them. I eventually found one. However, I thought it would be a nice opportunity to give Gemini a try. The goal was less to develop a rigorous agent setup than to see how far an out-of-the-box agent CLI could take me. I was impressed with how easy it was to have Gemini kick-start an MVP and let me focus on the interaction with the product.

All I did was install Gemini CLI, create an empty folder, and type 6 letters:

$ gemini

Here are the highlights of my 1-2 hour interaction spread over 3 days:

Day 1: From Scratch (~30 min)

  • Write me a pdf uploader and viewer app. It created one that allows me to provide pdf links, and lists the ones I have previously uploaded. Clicking on them opens up a pdf viewer.
  • Have it take an arXiv abstract link (ex: https://arxiv.org/abs/2210.07897) and extract metadata and download the associated pdf. I also gave it these xpath selectors here to extract metadata. I can now post arXiv links and have them show up.
  • Add keyword search so I can easily (and simply) search for a paper.
  • Render an image thumbnail of the pdf and add it to the list items
  • Add highlighting to the pdf
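
To make two of these steps concrete, here is a minimal sketch of mapping an arXiv abstract link to its PDF link and of a simple keyword search. It is not the code Gemini produced. The names `Paper`, `abs_to_pdf_url` and `search` are hypothetical, the sample records are made up, and metadata extraction is left out entirely. The only outside fact relied on is that arXiv serves the PDF for an abstract page `https://arxiv.org/abs/<id>` at `https://arxiv.org/pdf/<id>`.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches new-style arXiv abstract URLs such as https://arxiv.org/abs/2210.07897
# (an optional version suffix like "v2" is allowed).
ABS_URL = re.compile(r"^https?://arxiv\.org/abs/(?P<id>\d{4}\.\d{4,5}(v\d+)?)$")


@dataclass
class Paper:
    """Hypothetical record for one stored paper (not the app's actual schema)."""
    title: str
    abstract: str
    pdf_url: str


def abs_to_pdf_url(abs_url: str) -> str:
    """Turn an arXiv abstract URL into the URL of the associated PDF."""
    match = ABS_URL.match(abs_url.strip())
    if match is None:
        raise ValueError(f"not an arXiv abstract URL: {abs_url!r}")
    return f"https://arxiv.org/pdf/{match['id']}"


def search(papers: List[Paper], query: str) -> List[Paper]:
    """Keep papers whose title or abstract contains every query term.

    Matching is a case-insensitive substring check, nothing smarter.
    """
    terms = query.lower().split()
    results = []
    for paper in papers:
        haystack = f"{paper.title} {paper.abstract}".lower()
        if all(term in haystack for term in terms):
            results.append(paper)
    return results
```

Calling `abs_to_pdf_url("https://arxiv.org/abs/2210.07897")` gives the PDF link for the example above, and `search(papers, "some keywords")` returns the matching records in their original order.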

After some back and forth with Gemini, in between moving on to other things, I had a working app that runs locally with the backend and frontend started separately. It felt like working with, and checking in on, a very knowledgeable colleague [1]. Very promising.

Day 2: Add a Feature to an Existing App (~15-20 min)

  • Add the ability to add scribble annotations to the pdf as a drawing overlaid on top of the pdf

It took a few tries and the feature isn’t great (the scribbles don’t follow the page), but it serves as a good MVP for future iteration.

Day 3: Prepare for Deployment (~5 min)

  • Add a dockerfile to allow me to deploy this app. This was just a one liner and then I had to go.

Gemini both wrote a Dockerfile and had the backend serve the frontend to make it easier to deploy.

Day 4 (No Gemini): Manually Deploy to Render (~30 min)

I did some quick searches for places to manually deploy an app these days. Render seemed like a good choice. I tagged my Docker image and pushed it to Docker Hub. Deploying the image failed on Render because Gemini had hard-coded some URLs that worked locally but did not work in Docker. I just updated them.

These were more or less trivial path updates, for example, the path to the compiled frontend files was:

app.mount("/static", StaticFiles(directory="/dist"), name="static")

but should have been:

app.mount("/static", StaticFiles(directory="/app/frontend/dist"), name="static")

as the Dockerfile was copying it there:

COPY --from=frontend-builder /app/frontend/dist ./frontend/dist
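
One way to avoid this kind of hard-coded path is to compute the directory from a base directory, with an optional override. The following is only a sketch under assumptions: `resolve_static_dir` and the `FRONTEND_DIST` environment variable are hypothetical names that do not exist in the app, and the `frontend/dist` layout is taken from the `COPY` line above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_static_dir(base_dir: Path,
                       env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the directory holding the compiled frontend files.

    FRONTEND_DIST is a hypothetical environment variable: when it is set
    (and non-empty) it wins, otherwise the path is derived from base_dir,
    so the same code works locally and inside the container.
    """
    env = os.environ if env is None else env
    override = env.get("FRONTEND_DIST")
    if override:
        return Path(override)
    return base_dir / "frontend" / "dist"


# Possible use in the backend (assumes the image's working directory is /app):
#   static_dir = resolve_static_dir(Path("/app"))
#   app.mount("/static", StaticFiles(directory=str(static_dir)), name="static")
```

With this shape, a local run can point `base_dir` at the repository root, while the container passes `/app` or sets the override, and neither path has to be written into the `app.mount` call itself.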

The outcome

I deployed the app live at https://pdf-reader-app-latest.onrender.com/. Try it out. It’s still barebones, but for very little time commitment, I think it’s off to a pretty good start.

NOTE: It will take about a minute to load and may not work if there is high traffic; I’m using freebie quota.

You can find the code here [2].

The app currently only has two views, list view and pdf view. If the app does not render, here is the list view (main page) of the app:

[Screenshot: List view of the pdf reader app.]

And finally, the pdf view:

[Screenshot: PDF view of the pdf reader app.]

Discussion

During this interaction, the majority of my time went towards thinking about how to get the product working. In the past, I would have devoted more time to development (reading documentation, searching for libraries, writing code). I still needed to devote some time to debugging (reading console log output, sending it to Gemini, or, in a small number of cases, searching and fixing it myself).

Here are some other takeaways I believe are worth highlighting:

  1. Gemini is a software overengineer: Gemini will sometimes choose a solution more complex than needed. For example, when I first asked Gemini to add highlighting, it wrote its own custom tool. It also kept making mistakes, getting the coordinate conversions wrong and calling conversion methods that don’t exist. Surprised that nothing already existed, I did a simple search and found this plugin. I then insisted that it try to use it, and it eventually did (phew, because UIs are not my forte).
    [Image omitted. Credit: https://imgur.com/gallery/before-lock-yK9GRw8]
  2. Do not give Gemini git access (or access to anything that is not backed up, really): Gemini sometimes made a bad change and wanted to commit it over an uncommitted working change. I never allowed access and simply committed when I felt the change was ready (and working).
  3. Be specific on the product goal, but loose on the implementation details: This runs counter to what many people suggest, and their advice is generally good for apps that are past the MVP stage and need a specific implementation. For MVPs, however, you generally don’t know what you don’t know. So let Gemini give you some creative feedback by being vague where the implementation matters less for your intended goal. For example, I wanted to add a drawing feature on top of the pdf viewer. I didn’t mention how the user would interact with it (should there be a toggle button, etc.?). This let Gemini suggest what was likely the most common and reasonable option, which seemed good to me.
  4. Run the server and test it yourself: During prototyping, you don’t really know what you want. I advise against letting Gemini try to run the server. Run it yourself and see what errors Gemini’s changes produce. Gemini can often get stuck trying to run the server and examining more output than it needs, which wastes valuable tokens.
  5. Loops happen, so ensure you have manual validation checkpoints: I had Gemini always ask me when it wanted to run the server. Most of the time I would refuse and run it myself. I did this because I’ve seen Gemini get stuck in loops. A common one is to run the server, find a warning, introduce a bad change to fix it, see the server fail, undo it, see the warning again, then attempt the same bad change again, and so on.
  6. Eventually do add unit tests: This will allow Gemini to partially validate its changes, and save you from having to copy and paste the errors you see back to it. Gemini makes mistakes like we all do. Unfortunately, I did not do this here (1-2 hours is not much time). I mention it mainly because I did see Gemini run into simple errors that broke the server and that a unit test could have caught earlier.
  7. For the simple things, it appears to follow some best practices: I was impressed by the Dockerfile. Gemini used a multi-stage build, so the final image is fairly lightweight and doesn’t contain the npm toolchain, which is unnecessary at runtime. The compressed size of the image is still 95MB, but this is a great start.
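
As an illustration of takeaways 1 and 6 together, here is the kind of small unit test that could catch a wrong coordinate conversion before it reaches the UI. This is a sketch, not the app's code or the highlighting plugin's API: `pdf_to_viewport` and `viewport_to_pdf` are hypothetical helpers. They assume the default PDF coordinate system (origin at the bottom-left of the page, y pointing up), a viewport with its origin at the top-left and y pointing down, no page rotation, and no crop-box offset.

```python
import unittest
from typing import Tuple


def pdf_to_viewport(x: float, y: float, page_height: float,
                    scale: float) -> Tuple[float, float]:
    """Map a point in PDF units to viewport pixels by flipping y, then scaling."""
    return (x * scale, (page_height - y) * scale)


def viewport_to_pdf(x: float, y: float, page_height: float,
                    scale: float) -> Tuple[float, float]:
    """Inverse of pdf_to_viewport: undo the scaling, then flip y back."""
    return (x / scale, page_height - y / scale)


class CoordinateConversionTest(unittest.TestCase):
    def test_pdf_origin_lands_at_viewport_bottom_left(self):
        # The bottom-left corner of the page should be the bottom-left pixel.
        vx, vy = pdf_to_viewport(0, 0, page_height=792, scale=2.0)
        self.assertAlmostEqual(vx, 0.0)
        self.assertAlmostEqual(vy, 1584.0)

    def test_round_trip_returns_the_original_point(self):
        vx, vy = pdf_to_viewport(100, 200, page_height=792, scale=1.5)
        px, py = viewport_to_pdf(vx, vy, page_height=792, scale=1.5)
        self.assertAlmostEqual(px, 100)
        self.assertAlmostEqual(py, 200)
```

Saved in a test file, this runs with `python -m unittest`, which gives the agent a pass/fail signal it can check on its own instead of relying on errors pasted back from the browser console.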

Conclusion

With each technological wave, we find ourselves pushed into a new work dynamic. Although these changes seem to be arriving faster than we can keep up with, I believe there is one comforting constant. To grow in our careers, we must remain focused not on specific technological advances, but on value creation. What value do you create today? What next step in your career would enable you to create more value than you do now? Gemini can probably help make it happen.

Citation

@misc{jlhermitte2025pdfstoreapp,
  title={Writing PDF Store App With Gemini},
  author={Julien R. Lhermitte},
  year={2025},
  howpublished={\url{https://jrmlhermitte.github.io/2025/08/20/pdf-store-app-with-gemini.html}}
}

Footnotes

  1. I want to add that such a statement does not imply agents will replace interns/junior engineers. We absolutely need software engineering knowledge. Just like we need to go to college to learn fundamental things we rarely use (e.g. Calculus for engineering), this will just be the same stepping stone interns/junior devs have always gone through except a bit more accelerated (i.e. when you master the foundations, you can build a larger project much quicker). As a matter of fact, I’m jealous because their careers will actually accelerate way faster! :) 

  2. Warning: the code isn’t the prettiest. I have avoided making any edits to try to preserve what a Gemini project would look like. 

