
Document_Comparison

Document Comparison was a project that I built over the course of an internship. Because of this, I cannot show the exact final code that I created, but I can discuss it here. The program takes two PDFs, highlights the text that is unique to each of them, and then downloads the highlighted copies. It primarily uses PyMuPDF and PyPDF to extract the text, then Python's difflib to compare the two extracted texts and find the differences. Finally, it uses PyMuPDF to highlight those differences in both PDFs before the two files are downloaded.
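Since the real code can't be shared, here is a minimal sketch of what the comparison step could look like using only difflib. This is not the internship code: the function name `unique_words`, the word-level splitting, and the sample strings are all assumptions made for illustration, and the text is assumed to have already been extracted from each PDF.

```python
from difflib import SequenceMatcher


def unique_words(text_a, text_b):
    """Return (words only in text_a, words only in text_b).

    Hypothetical helper for illustration; splits on whitespace and
    compares the two word sequences with difflib.SequenceMatcher.
    """
    words_a = text_a.split()
    words_b = text_b.split()

    # autojunk=False turns off the heuristic that treats very common
    # words as junk in long sequences.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)

    only_a, only_b = [], []
    for tag, a1, a2, b1, b2 in matcher.get_opcodes():
        # "delete" and "replace" cover words present only in the first text.
        if tag in ("delete", "replace"):
            only_a.extend(words_a[a1:a2])
        # "insert" and "replace" cover words present only in the second text.
        if tag in ("insert", "replace"):
            only_b.extend(words_b[b1:b2])
    return only_a, only_b


if __name__ == "__main__":
    first, second = unique_words(
        "the quick brown fox",
        "the slow brown fox jumps",
    )
    print(first)   # words to highlight in the first PDF
    print(second)  # words to highlight in the second PDF
```

In a full pipeline, the words returned for each document would then be located on the page and highlighted, which is the part PyMuPDF handles.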

This was my first professional-grade product, and it definitely consumed the greater portion of my summer of 2023 to develop. I was very happy to have the opportunity to work on something like this, though. I gained a lot of experience in working with other developers, using GitHub, and researching packages and modules. Overall, I am very satisfied with the end result, and so were the people I was working with!

Although I can't share the full code, I can show the portion that was used to highlight the text in the PDFs. Take a look if you're curious.


View on GitHub