Update bibliographies to include additional metadata. #335

Draft
stumbo wants to merge 4 commits into main from bs23_schema_org_support

@stumbo stumbo commented Jun 8, 2026

Member

Annotate bibliography entries with metadata based on the schema.org CreativeWork schema and its specializations for different types of scholarly works. As a result, Google and other search engines that crawl and index these pages will be able to better understand and represent the material on them, which should yield better search results.

In addition, for our internal AI-assisted search work, this provides more information about the contents of these pages and material for building a more detailed model of the Interlisp space that can be leveraged when responding to queries.

What is in this PR:

  • Updated generation of bibliography pages: layouts/_partials/bibliography-json-ld.html writes the JSON-LD containing the metadata for each bibliographic item. The specific metadata and overall format are driven by the Zotero type assigned to each bibliographic entry.
  • Updated layouts/_partials/hooks/head-end.html to call bibliography-json-ld.html.
  • Moved shared publication date functionality into layouts/_partials/bibliography-date.html.
  • Replaced publication date functionality in layouts/bibliography/single.html with a call to bibliography-date.html.
  • Minor cleanup of scripts/bibSplit.pl. Also added a placeholder for concepts: the goal is to start annotating Zotero entries with the computer science concepts addressed by each paper. This information would then flow into the bibliographic metadata and become part of the searchable knowledge base.
  • Updated config/_default/params.yaml, which fixes a long-standing issue with warning messages.
  • Updated content/en/history/bibliography/_index.md, the other half of the warning message fix.
  • Created a set of test cases and added a test harness in tests/ to help validate metadata generation and formatting.
  • Created the params file used when running test cases, config/testing/hugo.yaml.
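
To illustrate what "driven by the Zotero type" could look like, here is a minimal Python sketch. It is not the Hugo partial from this PR: the mapping table, `schema_type_for`, and `base_record` are hypothetical names, and the actual type choices in bibliography-json-ld.html may differ. Only the `ScholarlyArticle` type is confirmed by the generated output shown in this conversation.

```python
# Hypothetical mapping from Zotero item types to schema.org types.
# The real mapping lives in the Hugo partial and may differ.
ZOTERO_TO_SCHEMA_TYPE = {
    "journalArticle": "ScholarlyArticle",
    "conferencePaper": "ScholarlyArticle",
    "book": "Book",
    "bookSection": "Chapter",
    "thesis": "Thesis",
    "report": "Report",
    "webpage": "WebPage",
}


def schema_type_for(zotero_type: str) -> str:
    """Return a schema.org @type, falling back to the generic CreativeWork."""
    return ZOTERO_TO_SCHEMA_TYPE.get(zotero_type, "CreativeWork")


def base_record(zotero_type: str, title: str, url: str) -> dict:
    """Build the fields every entry shares; type-specific fields would be added later."""
    return {
        "@context": "https://schema.org",
        "@type": schema_type_for(zotero_type),
        "name": title.strip(),  # strip stray whitespace from the title
        "url": url,
    }
```

Falling back to `CreativeWork` keeps unrecognized Zotero types valid as schema.org records while making them easy to spot in review.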

To do: Further revise the test cases. Move from hardcoded Markdown files alone to a combination of hardcoded files and actual files generated from Zotero output, so that errors are flagged if the Zotero output or its processing changes.
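
As a rough idea of the kind of check such test cases could perform, the sketch below pulls JSON-LD blocks out of a rendered page and reports missing keys. It is an assumption-laden illustration, not the harness in tests/: `JsonLdExtractor`, `extract_json_ld`, `missing_keys`, and the `REQUIRED_KEYS` set are all hypothetical, and it uses only the Python standard library.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            # json.loads raises ValueError on malformed JSON, which fails the test.
            self.blocks.append(json.loads("".join(self._chunks)))
            self._in_jsonld = False


def extract_json_ld(html: str) -> list:
    """Return every JSON-LD object found in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


# Assumed minimum set of keys; the real test cases may require more or fewer.
REQUIRED_KEYS = {"@context", "@type", "name", "url"}


def missing_keys(block: dict) -> set:
    """Return the required keys that are absent from one JSON-LD object."""
    return REQUIRED_KEYS - block.keys()
```

Running the same checks against both hardcoded fixtures and pages built from real Zotero exports would surface regressions from either side.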


The metadata is encoded as JSON-LD and provides information to Google and other search engines. This is a first step toward enabling richer data in our searches and providing richer metadata for agentic AI.

This hasn't been thoroughly tested, but I wanted to make it available. The changes are integrated on the testing site. You can see the results using Google's Rich Results Test: https://search.google.com/test/rich-results/result/r%2Farticles?id=1glR_Xd8sI4btiGYnczSdg

Metadata is encoded as JSON-LD and provides information for Google and
other search engines.  A first step in enabling richer data in our
searches.
@stumbo stumbo requested review from hjellinek, masinter and pamoroso June 8, 2026 16:47
@pamoroso

pamoroso commented Jun 8, 2026

Member

I have no idea how to test the new metadata features, but browsing and searching the bibliography seem to work fine on the staging server, except for one test case. I entered "knowledge programming in loops" in the search box, expecting this bibliography entry to show up in the results:

Stefik, M., Bobrow, D. G., Mittal, S., & Conway, L. (1983). KNOWLEDGE PROGRAMMING IN LOOPS. 11.

But the entry doesn't show up at all in the search results.

@stumbo

stumbo commented Jun 9, 2026

Member Author

There isn't much you can do to test it via searching at present. This PR only adds metadata to each bibliographic entry. Once this PR is merged, built into the production website, and the site is recrawled, the metadata will become available to and usable by search engines. Ideally, the result will be better, more informative search results.

Specifically, this PR adds a script containing metadata to each bibliographic entry. For the Knowledge Programming in Loops article, the following is injected into the page:

<script type="application/ld+json">{
  "@context": "https://schema.org",
  "@type": "ScholarlyArticle",
  "author": [
    {
      "@type": "Person",
      "familyName": "Stefik",
      "givenName": "Mark",
      "name": "Mark Stefik"
    },
    {
      "@type": "Person",
      "familyName": "Bobrow",
      "givenName": "Daniel G.",
      "name": "Daniel G. Bobrow"
    },
    {
      "@type": "Person",
      "familyName": "Mittal",
      "givenName": "Sanjay",
      "name": "Sanjay Mittal"
    },
    {
      "@type": "Person",
      "familyName": "Conway",
      "givenName": "Lynn",
      "name": "Lynn Conway"
    }
  ],
  "dateModified": "2025-10-24T21:18:05Z",
  "datePublished": "1983-01-01",
  "description": "Early this year fifty people took an experimental course at Xerox PARC on knowledge programming in Loops. During the course, they extended and debugged small knowledge systems in a simulated economics domain called Truckin'. Everyone learned how to use the Loops environment, formulated the knowledge for their own program, and represented it in Loops. At the end of the course a knowledge competition was run so that the strategies used in the different systems could be compared. The punchline to this story is that almost everyone learned enough about Loops to complete a small knowledge system in only three days. Although one must exercise caution in extrapolating from small experiments, the results suggest that there is substantial power in integrating multiple programming paradigms.\n",
  "inLanguage": "en-US",
  "name": "KNOWLEDGE PROGRAMMING IN LOOPS\n",
  "pagination": "11",
  "sameAs": [
    "https://www.markstefik.com/wp-content/uploads/2011/04/1983-AI-Mag-Stefik-LOOPS.pdf",
    "https://www.zotero.org/groups/2914042/items/A7ZZFYKB"
  ],
  "url": "https://stumbo.github.io/InterlispDraft.github.io/history/bibliography/a7zzfykb/"
}</script>
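
Note that the `name` and `description` values above end with a newline character. A small check along the following lines could flag such values automatically. This is only a sketch: `whitespace_issues` is a hypothetical helper, not part of this PR.

```python
def whitespace_issues(node, path=""):
    """Yield the paths of string values with leading or trailing whitespace."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from whitespace_issues(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from whitespace_issues(value, f"{path}[{index}]")
    elif isinstance(node, str) and node != node.strip():
        yield path


# Trimmed-down version of the record above, for illustration only.
sample = {
    "@type": "ScholarlyArticle",
    "author": [{"@type": "Person", "name": "Mark Stefik"}],
    "name": "KNOWLEDGE PROGRAMMING IN LOOPS\n",
    "pagination": "11",
}
print(list(whitespace_issues(sample)))  # ['name']
```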

All of this is separate from the AI search activities. A secondary task will be to incorporate the metadata into the data used by the AI search engine work I've started. That will require additional work to expand the data collected from the webpages; the data we currently pass into the LLM carries minimal metadata.

Before moving this PR out of the draft state, I need to spend more time reviewing the generated metadata to ensure it is accurate, identifying gaps, and documenting the schema. We may decide there is additional upstream work to do, either within Zotero or in extracting information from Zotero.

Another question that needs to be addressed is whether the format chosen for representing metadata (JSON-LD) is the best for our needs. It seems to be the preferred method for SEO-related activities, but it may be less ideal for integration with AI.
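
One possible way to carry this metadata into LLM input is to flatten each JSON-LD record into a short text header placed ahead of the page content. The sketch below assumes that approach; `metadata_header` and the chosen fields are hypothetical, and nothing here reflects how the AI search pipeline is actually built.

```python
def metadata_header(record: dict) -> str:
    """Render selected JSON-LD fields as a compact plain-text header."""
    authors = ", ".join(
        person.get("name", "") for person in record.get("author", [])
    )
    lines = [
        f"Title: {record.get('name', '').strip()}",
        f"Type: {record.get('@type', 'CreativeWork')}",
        f"Authors: {authors}",
        f"Published: {record.get('datePublished', 'unknown')}",
        f"URL: {record.get('url', '')}",
    ]
    return "\n".join(lines)
```

A plain-text header like this keeps token usage low, at the cost of dropping the structure that JSON-LD provides.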

@pamoroso

pamoroso commented Jun 9, 2026

Member

I confirm I see the metadata in the page source.

 - Remove incorrect usage of locationCreated field
 - Fix formatting of titles.  Remove newline character at end of string
@pamoroso

Member

At commit 3940d92 the bibliography still seems to have no issues on the staging site.

Building test cases revealed minor, incorrect or poor usages of
metadata.
@pamoroso

pamoroso commented Jul 3, 2026

Member

Still nothing unusual on the staging site at commit 9222f83.

@stumbo stumbo changed the title from "Update bibliographies to include additional metadaa." to "Update bibliographies to include additional metadata." Jul 4, 2026
@pamoroso

pamoroso commented Jul 5, 2026

Member

The staging site is still working with no issues at commit 90cae4c.
