
Document_Comparison
Document Comparison was a project that I did over the course of an internship. Because
of this, I cannot show the exact final code that I created, but I can discuss it here.
The program takes two PDFs, highlights the text that is unique to each of them,
and then downloads the highlighted PDFs. It primarily uses PyMuPDF and PyPDF
to extract the text, then Python's difflib module to compare the output. It then
highlights the differences in both PDFs with PyMuPDF and downloads the two files.
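Since I can't share the real code, here is a minimal sketch of the general idea and not the internship implementation. The function names (unique_words, highlight_words) are made up for illustration, and the word-level comparison is an assumption about how the diff could be done. The highlighting step is also simplified: it marks every occurrence of a differing word on every page, rather than only the exact position the diff found.

```python
import difflib


def unique_words(text_a, text_b):
    """Return (words only in text_a, words only in text_b), in order.

    Hypothetical helper: compares the two texts word by word with difflib.
    """
    words_a, words_b = text_a.split(), text_b.split()
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    only_a, only_b = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            only_a.extend(words_a[i1:i2])
        if tag in ("insert", "replace"):
            only_b.extend(words_b[j1:j2])
    return only_a, only_b


def highlight_words(in_path, out_path, words):
    """Highlight each occurrence of the given words and save a new PDF.

    Hypothetical helper; simplification: matches by word text only.
    """
    import fitz  # PyMuPDF, imported here so the diff step works without it

    targets = set(words)
    doc = fitz.open(in_path)
    for page in doc:
        # Each entry is (x0, y0, x1, y1, word, block_no, line_no, word_no).
        for x0, y0, x1, y1, word, *_ in page.get_text("words"):
            if word in targets:
                page.add_highlight_annot(fitz.Rect(x0, y0, x1, y1))
    doc.save(out_path)
    doc.close()
```

For example, comparing "the quick brown fox" with "the slow brown fox jumps" gives ["quick"] for the first text and ["slow", "jumps"] for the second. Each list would then be passed to highlight_words along with its own PDF.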
This was my first professional-grade product, and it definitely consumed the greater portion
of my 2023 summer to develop. I was very happy to have the opportunity to work on
something like this, though. I gained a lot of experience working with other developers,
using GitHub, and researching packages and modules. Overall, I am very satisfied
with the end result, and so were the people I was working with!
Although I can't show off the full code, I can show off the portion of code that was used to
highlight the text in PDFs. Take a look if you're curious.
View on GitHub