vermouth.citation_parser module

class vermouth.citation_parser.BibTexDirector(force_field)

Bases: object

Lightweight parser for BibTeX files. A BibTeX file contains an assortment of entries, each describing a type of publication, and every entry type has a number of required and optional fields. A field such as Title, for example, gives the title of a publication. The general syntax is as follows:

@<entry>{<some custom ID>,
    field = {<content>},
    field = {<content>}}

Alternatively, the {} around the content can be replaced by quotation marks.

This parser only handles the version with {}, as used by Google Scholar. In addition, it does not check for missing or invalid fields: all fields are accepted and no fields are required.

static extract_fields(entry_string)

Given an entry string without the entry type and identifier (i.e. ,<field_type> = {<content>}, etc.), split it into field types and contents using a regular expression.

Parameters:

entry_string (str)

Yields:

str, str – the field type, the field content
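The splitting step described above can be sketched as follows. This is a minimal illustration, not the regular expression vermouth itself uses: the names FIELD_PATTERN and extract_fields_sketch are hypothetical, and the pattern assumes that field contents contain no nested braces.

```python
import re

# Hypothetical pattern: a field name, "=", then brace-delimited content.
# Assumption: the content itself contains no nested braces.
FIELD_PATTERN = re.compile(r"(\w+)\s*=\s*\{([^{}]*)\}")


def extract_fields_sketch(entry_string):
    """Yield (field type, field content) pairs from an entry string."""
    for match in FIELD_PATTERN.finditer(entry_string):
        yield match.group(1), match.group(2)


entry = ", title = {A study}, year = {2020}"
fields = list(extract_fields_sketch(entry))
print(fields)
```

A content such as {The {DNA} helix} would not be matched correctly by this sketch; handling nested braces needs a brace-counting scan instead of a single pattern.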

static find_entries(citation_string)

Scan a string in which @ indicates the beginning of a new entry and yield the index of each @.

Parameters:

citation_string (str)

Yields:

int – position of ‘@’ in citation_string
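A generator with the behaviour described above could look like the sketch below. The name find_entry_starts is hypothetical and the body is an assumption about how such a scan might be written, not the library's code.

```python
def find_entry_starts(citation_string):
    """Yield the index of every '@' character in citation_string."""
    for index, char in enumerate(citation_string):
        if char == "@":
            yield index


sample = "@article{a1, title = {One}}@book{b2, title = {Two}}"
starts = list(find_entry_starts(sample))
print(starts)  # [0, 27]
```

Note that this sketch treats every @ as an entry start, so an @ inside a field content (for example an e-mail address) would also be reported.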

parse(lines)

Given lines from a BibTeX file, parse them and update the force field's citation instance variable.

parse_entry(entry_string)

Given a string describing a single entry, parse it and update the force_field citations dict with a dict of its fields.

pop_entry_type(entry_string)

Given a string describing a single entry, strip the entry type from the string and return it. Note that the string MUST contain the @.

Parameters:

entry_string (str)

Returns:

  • str – The entry type

  • str – The shortened string
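As a rough illustration of this step, the sketch below takes everything between the @ and the first opening brace as the entry type. The function name pop_entry_type_sketch and the choice to also drop the opening brace are assumptions made for the example.

```python
def pop_entry_type_sketch(entry_string):
    """Return (entry type, remaining string) for a single entry.

    Raises ValueError if the string contains no '@' or no '{' after it.
    """
    start = entry_string.index("@")
    brace = entry_string.index("{", start)
    entry_type = entry_string[start + 1:brace].strip()
    # Assumption: the opening brace is dropped together with the type.
    return entry_type, entry_string[brace + 1:]


entry_type, rest = pop_entry_type_sketch("@article{doe2020, title = {A}}")
print(entry_type)  # article
print(rest)        # doe2020, title = {A}}
```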

static pop_key(entry_string)

Given the string of a single entry from which the entry type has already been removed (see pop_entry_type), get the custom ID, strip it, and return it together with the entry string without that ID.

Parameters:

entry_string (str)

Returns:

the key and the string without the key

Return type:

str, str
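One simple way to separate the custom ID from the rest of the entry is to split at the first comma, as sketched below. The name pop_key_sketch is hypothetical, and the sketch assumes the ID itself contains no comma.

```python
def pop_key_sketch(entry_string):
    """Return (key, remaining string), splitting at the first comma."""
    key, _, rest = entry_string.partition(",")
    return key.strip(), rest


key, rest = pop_key_sketch("doe2020, title = {A study}}")
print(key)   # doe2020
print(rest)  #  title = {A study}}
```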

static prepare_file(lines)

BibTeX is not sensitive to line spacing, so the lines are joined into one string. Comment characters are not allowed.
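The joining step might be sketched as below. The function name prepare_file_sketch is hypothetical; stripping each line and joining with a single space is an assumption, and the sketch makes no attempt to detect or reject comment characters.

```python
def prepare_file_sketch(lines):
    """Join the lines of a BibTeX file into one string."""
    # Strip surrounding whitespace and newlines, skip empty lines,
    # and join with a single space so words are not fused together.
    stripped = (line.strip() for line in lines)
    return " ".join(line for line in stripped if line)


bib_lines = [
    "@article{doe2020,\n",
    "  title = {A study},\n",
    "\n",
    "  year = {2020}}\n",
]
joined = prepare_file_sketch(bib_lines)
print(joined)  # @article{doe2020, title = {A study}, year = {2020}}
```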

vermouth.citation_parser.citation_formatter(citation, title=False)

Very basic and minimal formatter for citations. It is adapted from basic ACS style formatting. Fields within [] are optional.

<authors> [journal] <year>; [doi]

Note that the formatter cannot format LaTeX-like syntax (e.g. a{”} for ae).
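The template above can be illustrated with the sketch below. It is not the library's formatter: the name format_citation_sketch is hypothetical, the citation is assumed to be a dict keyed by lower-case field names, the title option is left out because its effect is not described here, and the semicolon is assumed to appear only when a DOI is present.

```python
def format_citation_sketch(citation):
    """Format a citation dict as '<authors> [journal] <year>; [doi]'."""
    parts = [citation["author"]]
    if "journal" in citation:
        parts.append(citation["journal"])
    parts.append(citation["year"])
    text = " ".join(parts)
    # Assumption: the "; " separator is only emitted together with a DOI.
    if "doi" in citation:
        text += "; " + citation["doi"]
    return text


full = format_citation_sketch(
    {"author": "Doe, J.", "journal": "J. Chem.", "year": "2020", "doi": "10.1000/xyz"}
)
minimal = format_citation_sketch({"author": "Doe, J.", "year": "2020"})
print(full)     # Doe, J. J. Chem. 2020; 10.1000/xyz
print(minimal)  # Doe, J. 2020
```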

vermouth.citation_parser.read_bib(lines, force_field)