text2bib

I (Martin J. Osborne) initiated this project in 2006 when I was the first Managing Editor of the (Open Access) journal Theoretical Economics. (I was a member of the independent group that founded the journal, which was later taken over by the Econometric Society.) The purpose of the project was to convert the plain text references in accepted articles into BibTeX so that they could easily be formatted consistently.

With funding from the Student Experience Program of Project Open Source | Open Access at the University of Toronto, Fabian Qifei Bai created the first version of the conversion script in the spring of 2007.

When Bai graduated from the University of Toronto later in 2007, I took over the coding and wrote a front end for public use, using the Open Journal Systems software of the Public Knowledge Project as a framework. I have continued to develop both the conversion engine and the front end since then.

Starting in the summer of 2023, I reimplemented the system using the Laravel framework, making many improvements to the conversion engine at the same time. The new version was released on March 15, 2024.

The source code is available on GitHub: https://github.com/osbornemj/text2bib.

The converter consists of a large number of hand-coded rules for extracting the author, title, and publication information from character strings that represent references. I improve it by periodically checking the conversions of files uploaded by users for errors. When I see an error, I add the source and the correct version of the BibTeX entry to a database table of examples and modify the code to handle it correctly, while still correctly converting all the other examples. The examples table currently contains 1257 items. (Unfortunately, error reports by users are few and far between, and almost no user responds to clarifying questions, so the improvement of the algorithm proceeds much more slowly than it could.)
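
The workflow described above (hand-coded rules, plus a table of examples that every change must continue to convert correctly) can be illustrated with a small Python sketch. This is an illustration only, not the converter's actual code: the two rules, the names `RULES`, `convert`, `EXAMPLES`, and `failing_examples`, and the sample references are all hypothetical, and the real converter has far more rules and handles far more reference layouts.

```python
import re

# Hypothetical rules, tried in order. Each pairs a BibTeX entry type with a
# pattern for one reference layout; the real converter has far more rules.
RULES = [
    ("article", re.compile(
        r"^(?P<author>.+?)\s+\((?P<year>\d{4})\)[.,]?\s+"
        r"(?P<title>.+?)[.,]\s+"
        r"(?P<journal>[^,]+?)\s+(?P<volume>\d+),\s+"
        r"(?P<first>\d+)\s*[-\u2013]+\s*(?P<last>\d+)\.?$")),
    ("book", re.compile(
        r"^(?P<author>.+?)\s+\((?P<year>\d{4})\)[.,]?\s+"
        r"(?P<title>.+?)\.\s+"
        r"(?P<address>[^:.]+):\s+(?P<publisher>[^.]+)\.?$")),
]


def convert(reference):
    """Return a dict of BibTeX fields for the reference, or None if no rule matches."""
    text = " ".join(reference.split())  # collapse runs of whitespace
    for entry_type, pattern in RULES:
        match = pattern.match(text)
        if match is None:
            continue
        fields = match.groupdict()
        if entry_type == "article":
            # BibTeX convention: page ranges use a double hyphen.
            fields["pages"] = fields.pop("first") + "--" + fields.pop("last")
        return {"type": entry_type, **fields}
    return None


# Stand-in for the database table of examples: (source, correct conversion).
# The references are made-up placeholders, not real publications.
EXAMPLES = [
    ("Doe, J. and Roe, R. (2001), A sample title. Journal of Examples 12, 34-56.",
     {"type": "article", "author": "Doe, J. and Roe, R.", "year": "2001",
      "title": "A sample title", "journal": "Journal of Examples",
      "volume": "12", "pages": "34--56"}),
    ("Doe, J. (1999), A sample book. Sample City: Example Press.",
     {"type": "book", "author": "Doe, J.", "year": "1999",
      "title": "A sample book", "address": "Sample City",
      "publisher": "Example Press"}),
]


def failing_examples():
    """Return the sources whose conversion no longer matches the stored entry."""
    return [source for source, expected in EXAMPLES if convert(source) != expected]
```

In this sketch, fixing a reported error would mean appending the problematic reference and its correct entry to `EXAMPLES`, changing or adding a rule, and then checking that `failing_examples()` still returns an empty list, so that a fix for one reference does not silently break another.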

A natural alternative approach is to use AI rather than a hand-coded algorithm. The only barrier to doing so is cost, but surely AI is where the future lies.