This post is a summary of how we cut 28 hours of work down to 6 by spending 4 hours automating a workflow with a Large Language Model (LLM) and some Node.js scripts.
When building https://coloradostateparks.net, we needed to add a large amount of information to markdown files for each park. This information was compiled from public sources on the internet.
Introducing AI to Generate Page Content
My strategy was to start with support for one park and take note of lessons learned. All content updates were performed manually to get a feel for the barebones process.
For the next park's content, screenshots of the source content from internet sites were uploaded to ChatGPT. The LLM was asked to convert the images to markdown, which could then be added to the app.
This approach worked well, with one major downside: there was no way for ChatGPT to include URLs in the response. This led to extra manual work to track down each URL and add links to the generated content.
The next iteration of the process involved feeding the source website HTML to ChatGPT for each page to convert to markdown. This approach solved the problem of generating content with accurate links but introduced its own downsides.
The main disadvantage was that the source pages contained a lot of information we didn't want to include in the new web app. The solution was to open the developer tools on each source page, find the main content element, and feed only that element's HTML into the ChatGPT prompt.
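As a rough illustration, assuming the source pages wrap their primary content in a `main` element (the real pages may use a different wrapper), grabbing just that element's HTML from the browser's DevTools console looks something like this:

```js
// Run in the DevTools console on the source page.
// The 'main' selector is an assumption; use whatever element
// actually wraps the page's primary content.
const mainEl = document.querySelector('main');

// copy() is a DevTools console utility that puts the string on the
// clipboard, ready to paste at the end of the ChatGPT prompt.
copy(mainEl.outerHTML);
```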
You might be wondering, "Why use an LLM? Can't you just run some code to convert HTML to markdown?" The reason is that each page has a contextual description in the front matter of the markdown. It turns out that LLMs do a great job of summarizing page content into one sentence.

I could have run a script to convert the HTML to markdown and then fed that output to ChatGPT to get the description. However, in addition to generating the page description, the LLM was performing additional tasks in the conversion process that would require more code to accomplish (e.g. transforming relative URL paths in the source content to absolute links). After considering it, it was clear that the effort to delegate the conversion work to a script was not worth the API cost savings.
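For context, the front matter in question looks roughly like this; the values below are made-up placeholders rather than real park content:

```md
---
title: Park Overview
description: One-sentence summary of the page, generated by the LLM during conversion.
---

The converted markdown content for the page follows here...
```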
Iterating on the Process
There were a handful of other minor tweaks to make to the prompt, but soon the output was consistently in a great place.
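The exact prompt isn't shown here, but based on the requirements described above, a rough reconstruction would ask for something along these lines:

```text
Convert the following HTML to markdown.

- Start the output with front matter that includes a one-sentence
  description summarizing the page content.
- Convert any relative URL paths in the source to absolute links.

<main content HTML pasted here>
```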
By this point, the LLM was doing most of the heavy lifting. Content was added for a few parks, and there was an established order of operations for supporting new parks:
- Open all relevant content source pages in the browser.
- For each page:
  - Paste a canned prompt into ChatGPT.
  - Inspect the DOM and find the main HTML element to copy.
  - Paste the HTML for the page at the end of the prompt and submit it to ChatGPT.
  - While the output generates, create the new markdown file for the new app.
  - Paste the ChatGPT output into the new file.
  - Make any adjustments (e.g. removing images).
  - Update all content source links to the corresponding internal link in the new app.
- Add a custom widget to the park overview page for contact information.
- Add a custom Google Maps widget to the directions page for the park.
There were 28 more parks to support at this stage in the app-building process.
Some parks had more pages of content than others. At an average of 20 pages per park, each park was taking about an hour to add the content and polish the UX. This meant we still had around 28 hours of work ahead of us to add the content for the rest of the parks.
We were prepared to spend 2 hours a day for two weeks in order to finish the app MVP. It was at this point that we had the idea to automate the process even further. If doing so could ultimately save a significant amount of the time needed to finish the project, spending the upfront time to explore options would be worth it.
Can We Automate More?
My first question was whether we could automate the manual copying and pasting of the prompt into ChatGPT.
Sure enough, the ChatGPT API was easy to get up and running using LangChain. In no time, we had a working Node.js script that could take a list of URLs and, for each URL, would:
- Fetch the HTML associated with the URL.
- Query for the main content DOM node.
- Add the main content HTML to the canned ChatGPT prompt.
- Use the ChatGPT API to get the converted markdown.
- Write the markdown to a file in the relevant park directory.
Here's a look at the main function of the script:
```js
async function processPages() {
  if (!process.env.OPENAI_API_KEY) {
    console.error('Please set the OPENAI_API_KEY environment variable.');
    return;
  }

  if (!PARK_SLUG) {
    console.error('Please set the PARK_SLUG variable to the park you are adding pages for.');
    return;
  }

  for (const page of pages) {
    console.log(`Processing ${page.url}...`);

    try {
      // Fetch the source page, isolate the main content, and convert it.
      const html = await fetchHTML(page.url);
      const mainElementHtml = extractMainElement(html);
      const markdown = await convertHtmlToMarkdown(mainElementHtml);

      // Write the converted markdown into the park's content directory.
      fs.writeFileSync(PATH + page.fileName, markdown);
      console.log(`Successfully wrote ${page.fileName}\n`);
    } catch (error) {
      console.error(`Error processing ${page.url}:`, error);
    }
  }
}

processPages();
```
The `processPages` function iterates through the list of pages; for each URL, it fetches the HTML, extracts the main content, converts it to markdown using the ChatGPT API, and writes the result to a file, saving significant time compared to the manual process.
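The helper functions aren't shown above. As a rough sketch of what they might look like, assuming Node 18+ for the built-in fetch, cheerio for HTML parsing, LangChain's @langchain/openai package, and a hypothetical CANNED_PROMPT constant holding the conversion prompt (the real implementation may differ):

```js
const cheerio = require('cheerio');
const { ChatOpenAI } = require('@langchain/openai');
const { HumanMessage } = require('@langchain/core/messages');

// Fetch the raw HTML for a source page (Node 18+ ships with fetch built in).
async function fetchHTML(url) {
  const response = await fetch(url);
  return response.text();
}

// Pull out only the main content element so the prompt stays small.
// The 'main' selector is an assumption about the source pages.
function extractMainElement(html) {
  const $ = cheerio.load(html);
  return $('main').html();
}

// Send the canned prompt plus the page HTML to the ChatGPT API.
// The model name is a placeholder; OPENAI_API_KEY is read from the environment.
async function convertHtmlToMarkdown(html) {
  const model = new ChatOpenAI({ model: 'gpt-4o-mini', temperature: 0 });
  const response = await model.invoke([
    new HumanMessage(`${CANNED_PROMPT}\n\n${html}`),
  ]);
  return response.content;
}
```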
Now, the manual work was confined to updating certain files to use custom components for better UX, such as embedding Google Maps directly into the page.
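As a purely illustrative example of that kind of touch-up (this is not the app's actual component, and it assumes a React-style setup and the Google Maps Embed API, with GOOGLE_MAPS_API_KEY as a placeholder), a directions page might render an embedded map like so:

```jsx
// Hypothetical component, not the app's real code.
function ParkMap({ query }) {
  const src = `https://www.google.com/maps/embed/v1/place?key=${GOOGLE_MAPS_API_KEY}&q=${encodeURIComponent(query)}`;
  return (
    <iframe
      title="Park directions map"
      src={src}
      width="100%"
      height="400"
      style={{ border: 0 }}
      loading="lazy"
      allowFullScreen
    />
  );
}
```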
There is an intuition developers need to build for when to stop automating. We could have tried to automate the remaining manual work per park, but it didn't seem like the juice was worth the squeeze to take it that far.
Wrapping Up the Project
Creating the script and iterating on it until it did most of the heavy lifting took about 4 hours. This LLM-powered script reduced the time it took to add the content for a park from roughly 1 hour to 20 minutes.
We ended up converting 10 parks while iterating on the automation logic, leaving 18 more parks to convert, which at 20 minutes each came to roughly 6 hours of remaining work.
The time savings from this new technique were interesting: adding support for a park now took about 20 minutes regardless of its number of pages, whereas before, more pages meant more time.
With 28 hours of work ahead of us, it wasn't clear if the time spent on attempting to automate the HTML to markdown conversion with a script would be worth it. In hindsight, taking the extra time to explore ways to automate further was well worth it. Not only did it ultimately save a significant amount of work and time, but we also learned new techniques that can be leveraged for future projects.
Takeaways
By investing time in automating the content addition process, we were able to significantly reduce our workload and improve efficiency.
The lessons learned and techniques developed during this project will be invaluable for future endeavors. Automation, when applied judiciously, can be a powerful tool in a developer's toolkit.