Normalize authors' names in a .bib file
As most of you may know, there are many acceptable ways of writing authors' names. However, when I export .bib entries from software like Zotero (which sometimes exports "Last, First" names and sometimes "First Last" names) or JabRef (which exports fields the way you first entered them), or copy them from the Internet, I get authors' names in many different forms. Although these sources rarely provide them in incorrect or unusable ways, I'd like to normalize my .bib files so that I can Ctrl + F authors' names easily, fill in their names if they are abbreviated, and so on.
I am trying to use BibTool, which I already use to clean, format and sort my files. I've tried the following rules in my .bibtoolrsc file:
new.format.type = {17="%f%v%l%j"}
new.format.type = {17="%0f%0v%0l%0j"}
new.format.type = {17="%0f %0v %0l %0j"}
but when I run the bibtool command, all of my other rules work except these (I've tried them separately, of course).
Here is an example of what I want. I would like something like this:
author = {Brown, Noam and Sandholm, Tuomas}
to become this:
author = {Noam Brown and Tuomas Sandholm}
Does anyone know how to achieve this? I would prefer to use BibTool for everything, but if someone recommends some other command, that is acceptable too.
Edit: here is the content of my .bibtoolrsc file.
bibtex bibtool
asked Feb 17 '17 at 5:32 by Douglas De Rizzo Meneghetti
edited Apr 13 '17 at 12:35 by Community♦
According to the bibtool documentation, you should not have = before the brace, so just new.format.type {17=....}
– Andrew Swann, Feb 17 '17 at 8:22
Your attempt to sanitize is going in the wrong direction, I think: better practice is Brown, Noam and Sandholm, Tuomas.
– jon, Feb 17 '17 at 15:30
@AndrewSwann I've tried with and without the equals sign. Still, my other rules work, except for this one.
– Douglas De Rizzo Meneghetti, Feb 18 '17 at 17:15
There are spaces in the linked file not shown in your snippet here. Do they make a difference?
– Andrew Swann, Feb 18 '17 at 17:35
Neither the presence of spaces before or after the equals sign nor the presence of an equals sign between an entry name and the opening curly brace affects the running of the program. The presence of spaces after the opening curly brace and before the closing one also changes nothing.
– Douglas De Rizzo Meneghetti, Feb 20 '17 at 7:24
2 Answers
Here is my attempt with a Python script using bibtexparser (note: it replaces the .bib files in place! Modify the script if you do not want that):
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Written against the bibtexparser 1.x API (loads/dumps with a BibTexParser).
import os, sys
import pprint
import bibtexparser
from bibtexparser.bparser import BibTexParser

# kill stdout terminal buffering
buf_arg = 0
if sys.version_info[0] == 3:
    os.environ['PYTHONUNBUFFERED'] = '1'
    buf_arg = 1
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', buf_arg)
sys.stderr = os.fdopen(sys.stderr.fileno(), 'w', buf_arg)

# EDIT FOR YOUR FILES - relative to current working dir
mybibfiles = ["path1/file1.bib", "path2/file2.bib"]
numcommas = 0

for bibfile in mybibfiles:
    print((bibfile, os.path.isfile(bibfile)))
    # A fresh parser per file, to avoid reusing parser state across files.
    # homogenize_fields: Sanitize BibTeX field names, for example change `url` to `link` etc.
    tbparser = BibTexParser()
    tbparser.homogenize_fields = False  # no dice
    tbparser.alt_dict['url'] = 'url'    # this finally prevents change 'url' to 'link'
    with open(bibfile) as bibtex_file:
        bibtex_str = bibtex_file.read()
    bib_database = bibtexparser.loads(bibtex_str, tbparser)
    pprint.pprint(bib_database.entries)  # already here, would by default replace 'url' with 'link'!
    bibdblen = len(bib_database.entries)
    for icpbe, paperbibentry in enumerate(bib_database.entries):
        # Entries without an author field are reported as OK and left alone.
        authstr = paperbibentry.get('author', '')
        if "," in authstr:
            numcommas += 1
            report = "%d/%d: Comma present: '%s'" % (icpbe + 1, bibdblen, authstr)
            authstrauthors = authstr.split(" and ")
            for ia, author in enumerate(authstrauthors):
                if "," in author:
                    authorparts = author.split(", ")
                    # the first part [0] is last name, needs to become last
                    # get and remove the first part, then append it as last
                    lastname = authorparts.pop(0)
                    authorparts.append(lastname)
                    authorfirstlast = " ".join(authorparts)
                    authstrauthors[ia] = authorfirstlast
            paperbibentry['author'] = " and ".join(authstrauthors)
            bib_database.entries[icpbe] = paperbibentry
            report += " -> '%s'" % (paperbibentry['author'])
        else:
            report = "%d/%d: OK" % (icpbe + 1, bibdblen)
        if sys.version_info[0] == 3:
            print(report)
        else:  # python 2
            print(report.encode('utf-8'))
    # Overwrite the original file with the converted entries.
    with open(bibfile, 'w') as thebibfile:
        bibtex_str = bibtexparser.dumps(bib_database)
        if sys.version_info[0] < 3:  # python 2
            thebibfile.write(bibtex_str.encode('utf8'))
        else:  # python 3
            thebibfile.write(bibtex_str)

print("\nFound & converted total of %d author fields in format Last, First (with commas)." % (numcommas))
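The core of the script above is the per-name flip from "Last, First" to "First Last". A minimal sketch of just that step, as standalone functions that need no third-party package, might look like the following. The names `flip_name` and `normalize_author_field` are made up for this illustration. As an assumption of this sketch, names with more than one comma (such as "Last, Jr, First") are left untouched, since the "First Last" form cannot express the "Jr" part safely, and braces inside names are not handled.

```python
import re


def flip_name(name):
    """Turn 'Last, First' into 'First Last'.

    Names with no comma, or with more than one comma, are returned
    unchanged (apart from stripping surrounding whitespace).
    """
    parts = [part.strip() for part in name.split(",")]
    if len(parts) != 2 or not all(parts):
        return name.strip()
    last, first = parts
    return first + " " + last


def normalize_author_field(field):
    """Apply flip_name to every name in a BibTeX author field."""
    # Names are separated by the word "and" surrounded by whitespace.
    names = re.split(r"\s+and\s+", field.strip())
    return " and ".join(flip_name(name) for name in names)


print(normalize_author_field("Brown, Noam and Sandholm, Tuomas"))
# prints: Noam Brown and Tuomas Sandholm
```

Keeping the flip in a pure function makes it easy to try on a few strings before letting any script overwrite a .bib file.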
answered Aug 23 '17 at 16:23 by sdaau (edited Aug 24 '17 at 7:58)
While trying to solve the same problem almost a year later, I found out that JabRef has an option called "Cleanup entries" under the "Quality" menu. If one adds the rule "Normalize names of persons" for the "author" and/or "editor" fields, JabRef normalizes names to the "von Last, Jr., First" format. This is not exactly what the original question asks for, but since it homogenizes the way all name fields are represented in a .bib file, I think it's worth mentioning.
JabRef also points out which entries are out of spec via the "Quality" > "Check integrity" option.
It doesn't work with the BibTeX extended name format (see section 3.8 of the biber manual for what that is).
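To illustrate what the "von Last, Jr., First" target format means for simple names, here is a rough Python sketch of the conversion from "First von Last" to "von Last, First". It is not JabRef's implementation. The function names `to_last_first` and `author_field_to_last_first` are hypothetical, and the sketch uses a simplified version of BibTeX's convention that the "von" part starts at the first lowercase-initial word before the final word. Names that already contain a comma are left alone, and braces and "Jr" parts are not handled.

```python
import re


def to_last_first(name):
    """Convert 'First von Last' to 'von Last, First' (simplified rule)."""
    tokens = name.split()
    # Already in comma form, or a single-word name: nothing to reorder.
    if "," in name or len(tokens) < 2:
        return " ".join(tokens)
    # Positions of lowercase-initial words, ignoring the final word.
    lower = [i for i, tok in enumerate(tokens[:-1]) if tok[0].islower()]
    if lower:
        # Everything from the first lowercase word onward is "von Last".
        first = tokens[:lower[0]]
        von_last = tokens[lower[0]:]
    else:
        # No "von" part: the final word alone is the last name.
        first = tokens[:-1]
        von_last = tokens[-1:]
    if not first:
        return " ".join(von_last)
    return " ".join(von_last) + ", " + " ".join(first)


def author_field_to_last_first(field):
    """Apply to_last_first to every name in a BibTeX author field."""
    names = re.split(r"\s+and\s+", field.strip())
    return " and ".join(to_last_first(name) for name in names)


print(author_field_to_last_first("Noam Brown and Tuomas Sandholm"))
# prints: Brown, Noam and Sandholm, Tuomas
```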
answered 37 mins ago by Douglas De Rizzo Meneghetti