  • Alex Cates

Building a Python powered Resume


A few weeks ago, Nick Strayer posted an R package that produces a resume from data stored in a Google Sheet.

It was a cool application of R, and I immediately wanted to make a version in Python.


Given Python's flexibility, and since everyone, Python engineers included, needs a resume, I assumed someone had already tried this. A quick Google search brought me to a Towards Data Science article titled "Generating a Resume in Python", which seemed perfect. To my surprise, however, the approach was to use matplotlib's annotation function: create a graph area and add the information one annotation at a time. This works and lets you place each piece of information precisely on the graph grid, but it is also really limited.


How do you deal with updating information on the Resume?

How do you deal with multiple pages (which is common for academic Curriculum Vitae)?

What if I wanted to quickly redesign part of the Resume?


The goal of a Python powered resume is that it updates automatically, and one quick look showed me that adding a new line anywhere in the matplotlib version would put me in formatting hell.


I started looking for other solutions, which led me to the Python package Jinja2. A templating engine most often used with HTML documents, it immediately seemed like a better option. If I could fill in an HTML page, I would get all the benefits of HTML, CSS, and JavaScript for the initial creation of the resume. I could also keep multiple HTML templates, or make quick changes to the CSS and have them apply throughout. Then I could simply print the HTML page as a PDF and voilà! I have a resume.


In the next 6 steps, I will outline what I needed to do to create my Python powered resume. This is not a comprehensive guide; instead, I link to the other explainers that helped me at each step. Treat it as a general guide rather than definitive instructions. Finally, if you want to skip ahead, all the code is available at https://github.com/zanderman12/resume_builder


1. Set up a Google Sheet of Resume Info


As with Nick's package, the data has to be pulled from somewhere. I followed his lead and used a Google Sheet because it is easy to set up and update while providing plenty of flexibility for what I want. You can see an example of my Google Sheet here.


2. Get Resume Info into Python


Once you have the Google Sheet, you need to pull it into Python. This takes a few steps, mainly to give your local Python access to your Google Drive. I followed this great guide a while ago and now just import the 2 functions whenever I need to access a Google Sheet.


If you use multiple sheets as I did, you will have to import each sheet separately, but it's just a few quick lines of code. Add a few more lines to clean up the column and row names so the code is a little easier to read.


## pull data from Google Sheets
## (load_gsheet_data is the helper imported from the guide mentioned above)
entries_df = load_gsheet_data('YOUR GOOGLE SHEET ID', 'SHEET NAME', 100)
entries_df.columns = entries_df.iloc[0]       # first row holds the column names
entries_df.drop(1, axis = 0, inplace = True)  # drop the row with index label 1

textblocks_df = load_gsheet_data('YOUR GOOGLE SHEET ID', 'SHEET NAME', 100)
textblocks_df.columns = textblocks_df.iloc[0]
textblocks_df.drop(1, axis = 0, inplace = True)
textblocks_df.set_index('loc', inplace = True)  # look up text blocks by name

contact_df = load_gsheet_data('YOUR GOOGLE SHEET ID', 'SHEET NAME', 100)
contact_df.columns = contact_df.iloc[0]
contact_df.drop(1, axis = 0, inplace = True)
contact_df.set_index('loc', inplace = True)  # look up contact details by name
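
To see what this clean-up does without connecting to Google, here is a minimal sketch on a toy table. The helper name promote_header and the sample values are made up for illustration and are not part of my original code. It assumes the loader returns the header as the first row, and it drops that row by position rather than by index label.

```python
import pandas as pd

# Toy stand-in for what a sheet loader might return: header in the first row.
raw = pd.DataFrame([
    ["loc", "contact"],
    ["phone", "555-0100"],
    ["email", "me@example.com"],
])


def promote_header(df):
    """Use the first row as column names and remove it from the data."""
    out = df.iloc[1:].copy()           # every row except the header row
    out.columns = df.iloc[0].tolist()  # header row becomes the column names
    return out.reset_index(drop=True)


contact_df = promote_header(raw).set_index("loc")
print(contact_df.loc["email", "contact"])
```

Indexing by 'loc' is what later allows lookups such as contact_df.loc['email', 'contact'].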

3. Build your Jinja Template


Jinja2 is a Python package used by common Python web frameworks (it is Flask's default template engine and an optional one for Django). Jinja replaces placeholders in an HTML doc, marked with double curly brackets such as {{name}}, with strings sent from Python. For more on how to do this, I recommend some of the Jinja guides available online such as here or here.
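
As a minimal sketch of that placeholder replacement, the snippet below renders a tiny inline template. The template string and the sample name are invented for illustration; the real resume uses a full HTML file instead.

```python
from jinja2 import Template

# Two placeholders, each wrapped in double curly brackets.
template = Template("<h1>{{name}}</h1><p>{{desired_job_title}}</p>")

# Each keyword argument fills the placeholder with the same name.
html = template.render(name="Ada Example", desired_job_title="PhD Student")
print(html)
```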


I now had to create an HTML template. Being no designer and only loosely familiar with HTML, I decided to find a free HTML resume that I could modify. I ended up using this simple resume from Thomas Hardy, but you can use the same process on any HTML resume you find. This matters because it makes it easy to change the formatting of your resume drastically should you want to.


Now that we have a template, we can look at the HTML and start replacing the different text blocks with Jinja placeholders, so our HTML template starts to look like this:


<section>
 <div class="sectionTitle">
 <h1>Research Experience</h1>
 </div>
 
 <div class="sectionContent">
 <article>
 <h2>{{research1_title}} at {{research1_company}}</h2>
 <p class="subDetails">{{research1_start}} - {{research1_end}}</p>
 <p>{{research1_desc}}</p>
 </article>
 
 <article>
 <h2>{{research2_title}} at {{research2_company}}</h2>
 <p class="subDetails">{{research2_start}} - {{research2_end}}</p>
 <p>{{research2_desc}}</p>
 </article>
 
 </div>
 <div class="clear"></div>
 </section>

As you can see, most of the details are just placeholders. Remember, every keyword surrounded by {{ }} will be replaced with a string that Python sends when it renders the template. However, this code is screaming for a loop of some form. Anytime I start numbering my variables (like "research1_..." and "research2_...") I want to find a way to use a for loop. Luckily, Jinja supports for loops (along with other programmatic constructs like if statements). A loop opens with {% for x in y %} and closes with {% endfor %}. For those of you who often work in Matlab, a loop closed by an end statement should look familiar.


Now we can replace all the repeated variables with for loops, like so:


<section>
 <div class="sectionTitle">
 <h1>Research Experience</h1>
 </div>
 
 <div class="sectionContent">
 {%for r in research%}
 <article>
 <h2>{{r['title']}} at {{r['company']}}</h2>
 <p class="subDetails">{{r['start']}} - {{r['end']}}</p>
 <p>{{r['desc']}}</p>
 </article>

 {%endfor%}
 
 </div>
 <div class="clear"></div>
 </section>

In our loop, we are looping through a list of dictionaries. This lets us reuse the same markup for each entry while keeping our Jinja template looking Pythonic. The result is a much cleaner template and, more importantly, one that makes no assumptions about how many items to include. I could have 1 research experience entry or 100 and the template would not need to change. This keeps the code generalizable and means the formatting updates automatically whenever I add a new experience. Now we can go back to the Python side.
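
To check that the same template really handles any number of entries, here is a small sketch that renders a stripped-down version of the section with one, three, and zero entries. The inline template, the sample entry, and the "No entries yet." fallback are illustrative assumptions and are not taken from my actual template file.

```python
from jinja2 import Template

# Stripped-down section: a for loop over entries plus an if for the empty case.
section_template = Template(
    "<h1>Research Experience</h1>\n"
    "{% for r in research %}"
    "<article><h2>{{ r['title'] }} at {{ r['company'] }}</h2>"
    "<p>{{ r['start'] }} - {{ r['end'] }}</p></article>\n"
    "{% endfor %}"
    "{% if not research %}<p>No entries yet.</p>{% endif %}"
)

entry = {"title": "Research Assistant", "company": "Example University",
         "start": 2018, "end": "Current"}

html_one = section_template.render(research=[entry])
html_three = section_template.render(research=[entry] * 3)
html_none = section_template.render(research=[])

# The template is unchanged; only the length of the list differs.
print(html_one.count("<article>"), html_three.count("<article>"))
print(html_none)
```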


4. Fill the Jinja template


The great thing about Jinja is that to fill out all of these keywords, you just need to create a Python dictionary with each keyword in the template as a key. We can start our dictionary off with the keywords that don't need to be looped:


template_vars = {'name': 'Alexander Cates',
                 'phone': contact_df.loc['phone', 'contact'],
                 'email': contact_df.loc['email','contact'],
                 'linkedin': contact_df.loc['linkedin','contact'],
                 'desired_job_title': 'PhD Student',
                 'mission_statement': textblocks_df.loc['intro', 'text']}

Here we have created the dictionary template_vars which is what we will send to Jinja later. Now we need to create the entry dictionaries which will be looped within the template.


for cat in ['education', 'research', 'teaching', 'publications', 'awards']:
    catlist = []  # list of entry dictionaries for this category
    # iterrows() yields (index, Series) pairs; we only need the Series
    for _, row in entries_df[entries_df['section'] == cat].iterrows():
        rowdf = {}
        rowdf['title'] = row['title']
        rowdf['company'] = row['institution']
        rowdf['start'] = int(row['start'])
        if row['end'] == 'Current':
            rowdf['end'] = 'Current'  # ongoing entries keep the label
        else:
            rowdf['end'] = int(row['end'])
        rowdf['desc'] = row['description_1']
        catlist.append(rowdf)
    template_vars[cat] = catlist  # key matches the loop variable in the template

Let's unpack this. We first loop through the different sections. These obviously will change for different people, but with my academic background, I chose 'education', 'research', 'teaching', 'publications', and 'awards'. These categories are a column on my google sheet and importantly they are the same as the keywords in my Jinja template. Since we want the Jinja template to loop through something, we need to make the entry a list (in this case catlist). We then loop through the appropriate category, making a unique dictionary for each row (each entry) with all the appropriate information. This includes a couple of tricks (such as the if statement and converting the end year into an integer) just to make the final output a little prettier.
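
If you want to try this step without a Google Sheet, the sketch below runs the same idea on a small made-up DataFrame. The column names match the ones used above, but the rows and the build_entries helper are invented for illustration.

```python
import pandas as pd

# Made-up rows standing in for the entries sheet.
entries_df = pd.DataFrame({
    "section": ["research", "research", "teaching"],
    "title": ["Research Assistant", "Lab Manager", "Teaching Assistant"],
    "institution": ["Example University", "Example Institute", "Example University"],
    "start": ["2018", "2016", "2019"],
    "end": ["Current", "2018", "2020"],
    "description_1": ["Ran studies.", "Managed the lab.", "Led sections."],
})


def build_entries(df, section):
    """Return one dictionary per row of the given section."""
    entries = []
    for _, row in df[df["section"] == section].iterrows():
        end = row["end"]
        entries.append({
            "title": row["title"],
            "company": row["institution"],
            "start": int(row["start"]),
            # Keep the 'Current' label, otherwise convert the year to an int.
            "end": end if end == "Current" else int(end),
            "desc": row["description_1"],
        })
    return entries


template_vars = {cat: build_entries(entries_df, cat) for cat in ["research", "teaching"]}
print(template_vars["research"])
```

Wrapping the loop in a function is just one way to organize it; the inline loop above produces the same list of dictionaries.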


5. Render the Jinja Template


Now it's just a few lines to load the Jinja template (which, remember, is just the HTML file from above), render it (which fills in all the keywords), and save the result as an HTML file.


import jinja2

## load the template HTML doc
templateLoader = jinja2.FileSystemLoader(searchpath="./")
templateEnv = jinja2.Environment(loader=templateLoader)
TEMPLATE_FILE = "PATH TO TEMPLATE FILE/TEMPLATE FILE"  # relative to searchpath
template = templateEnv.get_template(TEMPLATE_FILE)

# fill in the variables with the data
html_out = template.render(template_vars)

# create and save to an HTML file; the with block closes the file for us
with open('resume.html', 'w', encoding = 'utf-8') as html_file:
    html_file.write(html_out)
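
One thing to watch for: by default, Jinja renders a keyword that is missing from the dictionary as an empty string, so a typo in a key can silently leave a blank spot on the resume. The sketch below is an optional variation and not what my code does. It uses jinja2's StrictUndefined so that a missing keyword raises an error instead. The inline DictLoader template and the sample values are assumptions for illustration.

```python
from jinja2 import DictLoader, Environment, StrictUndefined, UndefinedError

# An in-memory template stands in for the template file on disk.
env = Environment(
    loader=DictLoader({"resume.html": "<h1>{{ name }}</h1><p>{{ email }}</p>"}),
    undefined=StrictUndefined,  # fail loudly on missing keywords
)
template = env.get_template("resume.html")

try:
    template.render({"name": "Ada Example"})  # 'email' is missing on purpose
    missing = None
except UndefinedError as err:
    missing = str(err)

print(missing)

# With every keyword supplied, rendering works as usual.
html_out = template.render({"name": "Ada Example", "email": "ada@example.com"})
```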

6. Convert the HTML file to a pdf


Since most jobs want a PDF version of our resume, we can convert our HTML file into a PDF. This can be done with the pdfkit package. I had never used pdfkit before but started with the instructions here. This did not work immediately, but adding the 2 configuration lines posted on StackOverflow fixed it. Once configured, it is a single call to convert our HTML resume to a PDF.


import pdfkit

## configure pdfkit so it runs properly
##### BE SURE TO REPLACE WITH YOUR APPROPRIATE PATH
path_wkhtmltopdf = r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe'
config = pdfkit.configuration(wkhtmltopdf=path_wkhtmltopdf)
pdfkit.from_file('resume.html', 'resume.pdf',
                 options = {'page-size': 'Letter',
                            'margin-top': '0in',
                            'margin-right': '0in',
                            'margin-bottom': '0in',
                            'margin-left': '0in'},
                 configuration=config,
                 css = 'PATH TO CSS FILE')  # placeholder: path to the resume's CSS file

To explain each argument of pdfkit.from_file(): the first and second are the HTML input and PDF output file names. We then specify a few options to ensure the final document uses a normal page size (since HTML pages have no fixed length) and to remove the default margins (which are in addition to any margins we have defined in our HTML file). The configuration argument was needed to make the whole system work, as mentioned above. Finally, I needed to specify the CSS file so that we get all the design and formatting we want.
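
The hardcoded path above only works on a machine where wkhtmltopdf is installed in that exact Windows location. As a rough sketch of a more portable setup, and not something my code does, the snippet below looks for the binary on the system PATH first and falls back to a list of candidate locations. The function names find_wkhtmltopdf and html_to_pdf are hypothetical.

```python
import shutil
from pathlib import Path


def find_wkhtmltopdf(extra_candidates=()):
    """Return a path to the wkhtmltopdf binary, or None if it cannot be found."""
    on_path = shutil.which("wkhtmltopdf")  # works when the binary is on PATH
    if on_path:
        return on_path
    for candidate in extra_candidates:
        if Path(candidate).is_file():
            return str(candidate)
    return None


def html_to_pdf(html_path, pdf_path, css_path=None):
    """Convert an HTML file to a PDF using pdfkit and a discovered binary."""
    import pdfkit  # imported here so the lookup helper works without pdfkit

    binary = find_wkhtmltopdf([r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe"])
    if binary is None:
        raise FileNotFoundError("wkhtmltopdf not found; install it or add its path")
    config = pdfkit.configuration(wkhtmltopdf=binary)
    pdfkit.from_file(html_path, pdf_path,
                     options={"page-size": "Letter"},
                     configuration=config,
                     css=css_path)


print(find_wkhtmltopdf())
```

Calling html_to_pdf('resume.html', 'resume.pdf', css_path='PATH TO CSS FILE') would then replace the direct pdfkit.from_file() call, with the margin options added back in as needed.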


One note on the CSS. The original resume was designed to be a single page and so had no rules for page breaks. Adding a page-break-inside: avoid rule to the CSS keeps the selected element together rather than splitting it across pages. In this case, I applied it to the article tag so that each entry stays together while categories can still span pages. The following CSS is all you need to add. I also changed some aspects such as the color defaults to better match my personal aesthetic, so play with those as you like.


article {
 display:block;
 page-break-inside: avoid;
}

And there you have it. Check out all the code, including my Jinja template file, on my GitHub: https://github.com/zanderman12/resume_builder.


The real beauty of this is that it is quick and easy to build a new template if you find a different HTML resume you like. I personally have been considering buying one, though I haven't gotten there yet.


Good luck with your next job search!
