
Election scraping#2178

Open
delexagon wants to merge 4 commits into
codeforboston:mainfrom
delexagon:election-scraping


@delexagon delexagon commented Jun 28, 2026

Collaborator

Summary

Backend pipeline for #2130.

Known issues

I strongly recommend reviewing the data definitions in functions/src/legislators/electionTypes.ts.
If a search returns more than 1200 elections (in our case, more than 1200 general elections in a year), the results are clipped at 1200. The scraper does not currently detect this case.
The id is a hash of the data. If we later normalize the district data to a standard format, every id would change.
We should decide how to connect the candidate URL, which is used as a key here, to the legislator data elsewhere.
We may want to discuss whether additional 'types' of votes are possible, such as 'No preference', which may be rare but do occur.
The scraper does not currently check that the individual vote counts add up to the reported total, which might be desirable.
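Two of the gaps above (the 1200-result clip and the unchecked vote totals) could be covered by small guards. The following is only a sketch: the `CandidateResult` and `ElectionResult` shapes, the `otherVotes` field, and both function names are hypothetical and are not taken from electionTypes.ts.

```typescript
// Hypothetical shapes for illustration only; the real definitions live in
// functions/src/legislators/electionTypes.ts and may differ.
interface CandidateResult {
  name: string
  votes: number
}

interface ElectionResult {
  candidates: CandidateResult[]
  // Non-candidate tallies such as blanks or 'No preference' (assumed field).
  otherVotes: number
  totalVotes: number
}

// Result cap described in the known issues above.
const RESULT_CAP = 1200

// A search that returns at least `cap` results may have been clipped.
// This is conservative: a search with exactly 1200 real results is also flagged.
function mayBeTruncated(resultCount: number, cap: number = RESULT_CAP): boolean {
  return resultCount >= cap
}

// Returns the reported total minus the summed tallies; 0 means consistent.
function voteDiscrepancy(election: ElectionResult): number {
  const counted =
    election.candidates.reduce((sum, c) => sum + c.votes, 0) + election.otherVotes
  return election.totalVotes - counted
}
```

A caller could log a warning (rather than fail the scrape) when `mayBeTruncated` returns true or `voteDiscrepancy` is nonzero, since either condition may also occur with legitimate source data.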

Steps to test/reproduce

  1. Run curl 'http://localhost:5001/demo-dtp/us-central1/triggerPubsubFunction?scheduled=scrapeElections'. This should fetch all elections for this year (and last year if before July).
  2. Run yarn firebase-admin run-script backfillElections --env local --startYear 2024. This should fetch every general election from 2024.
  3. You may want to try fetchElectionsData in a variety of cases, for example when there are too many candidates (which requires a secondary page fetch), when there are no candidates, when there are 'No preference' votes, or when there are write-in candidates. I've been using npx ts-node --compiler-options '{"module":"CommonJS"}' functions/src/legislators/scrapeElections.ts with
import { writeFileSync } from "fs"
(async () => {
  // Fetch the scraped results and dump them to a file for manual inspection.
  const data = await fetchElectionsData()
  writeFileSync("/tmp/election-results.json", JSON.stringify(data, null, 2))
})()
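For checking step 1, the year rule it describes (this year, plus last year if before July) can be sketched as below. The function name `yearsToScrape` and the use of UTC are assumptions for illustration; the scheduled function in this PR may compute the range differently.

```typescript
// Returns the years to scrape for a given date: the current year, plus the
// previous year when the date falls before July. UTC is an assumption here.
function yearsToScrape(now: Date = new Date()): number[] {
  const year = now.getUTCFullYear()
  // getUTCMonth() is zero-based, so July is 6.
  const beforeJuly = now.getUTCMonth() < 6
  return beforeJuly ? [year - 1, year] : [year]
}
```

Under this sketch, a run on Jun 28, 2026 would cover 2025 and 2026, while a run in July 2026 or later would cover only 2026.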


vercel Bot commented Jun 28, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| maple-dev | Ready | Preview, Comment | Jun 28, 2026 3:22am |
