publications: - title: "Structure and content analysis for html medical articles: a hidden markov model approach" author: - name: "Jie Zou" link: "https://researchr.org/alias/jie-zou" - name: "Daniel X. Le" link: "https://researchr.org/alias/daniel-x.-le" - name: "George R. Thoma" link: "https://researchr.org/alias/george-r.-thoma" year: "2007" doi: "http://doi.acm.org/10.1145/1284420.1284468" links: doi: "http://doi.acm.org/10.1145/1284420.1284468" tags: - "analysis" - "Markov" - "systematic-approach" researchr: "https://researchr.org/publication/ZouLT07" cites: 0 citedby: 0 pages: "199-201" booktitle: "DOCENG" kind: "inproceedings" key: "ZouLT07" - title: "Bibliographic component extraction from references based on a text recognition error model" author: - name: "Atsuhiro Takasu" link: "https://researchr.org/alias/atsuhiro-takasu" - name: "Kenro Aihara" link: "https://researchr.org/alias/kenro-aihara" year: "2005" doi: "http://dx.doi.org/10.1002/scj.20323" links: doi: "http://dx.doi.org/10.1002/scj.20323" tags: - "rule-based" - "bibliography" researchr: "https://researchr.org/publication/TakasuA05" cites: 0 citedby: 0 journal: "scjapan" volume: "36" number: "7" pages: "13-22" kind: "article" key: "TakasuA05" - title: "A Decentral Library for Scientific Articles" author: - name: "Markus Wulff" link: "https://researchr.org/alias/markus-wulff" year: "2002" doi: "http://link.springer.de/link/service/series/0558/bibs/2346/23460143.htm" links: doi: "http://link.springer.de/link/service/series/0558/bibs/2346/23460143.htm" researchr: "https://researchr.org/publication/Wulff02%3A0" cites: 0 citedby: 0 pages: "143-152" booktitle: "iics" kind: "inproceedings" key: "Wulff02:0" - title: "Linking Micro and Macro Description of Scalable Social Systems Using Reference Nets" author: - name: "Michael Köhler" link: "https://researchr.org/alias/michael-k%C3%B6hler" - name: "Daniel Moldt" link: "https://researchr.org/alias/daniel-moldt" - name: "Heiko Rölke" link: 
"https://researchr.org/alias/heiko-r%C3%B6lke" - name: "Rüdiger Valk" link: "https://researchr.org/alias/r%C3%BCdiger-valk" year: "2005" doi: "http://dx.doi.org/10.1007/11594116_4" links: doi: "http://dx.doi.org/10.1007/11594116_4" tags: - "macros" - "social" researchr: "https://researchr.org/publication/KohlerMRV05" cites: 0 citedby: 0 pages: "51-67" booktitle: "dfki" kind: "inproceedings" key: "KohlerMRV05" - title: "How much of cited conference materials can be found using bibliographic tools?" author: - name: "Haruyuki Ogawa" link: "https://researchr.org/alias/haruyuki-ogawa" - name: "Nobuyuki Midorikawa" link: "https://researchr.org/alias/nobuyuki-midorikawa" - name: "Chie Yoshikawa" link: "https://researchr.org/alias/chie-yoshikawa" - name: "Ken-ichiro Saito" link: "https://researchr.org/alias/ken-ichiro-saito" - name: "Hiroshi Itsumura" link: "https://researchr.org/alias/hiroshi-itsumura" - name: "Masatsugo Kaneko" link: "https://researchr.org/alias/masatsugo-kaneko" - name: "Emiko Niki" link: "https://researchr.org/alias/emiko-niki" year: "1989" tags: - "bibliography" - "bibliographic tools" researchr: "https://researchr.org/publication/OgawaMYSIKN89" cites: 0 citedby: 0 journal: "jasis" volume: "40" number: "5" pages: "350-355" kind: "article" key: "OgawaMYSIKN89" - title: "Improving access to scientific literature" author: - name: "Steve Lawrence" link: "http://research.google.com/pubs/author103.html" year: "2002" doi: "http://doi.acm.org/10.1145/584931.584932" abstract: "CiteSeer (also known as ResearchIndex) is a digital library of scientific literature that aims to improve communication and progress in science. CiteSeer features include automatic metadata extraction, autonomous citation indexing, graph analysis, citation context extraction, and related document computation. This talk covers the design, implementation, and operation of CiteSeer.Steve Lawrence is a Senior Research Scientist at NEC Research Institute, Princeton, NJ. 
His research interests include information retrieval and machine learning. Dr. Lawrence has published over 50 papers in these areas, including articles in Science, Nature, CACM, and IEEE Computer. He has been interviewed by over 100 news organizations including the New York Times, Wall Street Journal, Washington Post, Reuters, Associated Press, CNN, MSNBC, BBC, and NPR. Hundreds of articles about his research have appeared worldwide in over 10 different languages. " links: doi: "http://doi.acm.org/10.1145/584931.584932" tags: - "design science" - "ResearchIndex" - "machine learning" - "information retrieval" - "digital library" - "design research" - "analysis" - "language design" - "graph-rewriting" - "digital libraries" - "e-science" - "context-aware" - "rewriting" - "design" researchr: "https://researchr.org/publication/Lawrence02" cites: 0 citedby: 0 pages: "55" booktitle: "widm" kind: "inproceedings" key: "Lawrence02" - title: "The S-Link-S Framework for Reference Linking: Architecture and Implementation" author: - name: "Eric Hellman" link: "https://researchr.org/alias/eric-hellman" year: "1999" doi: "http://urn.kb.se/resolve?urn=urn:nbn:se:elpub-9907" links: doi: "http://urn.kb.se/resolve?urn=urn:nbn:se:elpub-9907" tags: - "reference linking" - "architecture" researchr: "https://researchr.org/publication/Hellman99" cites: 0 citedby: 0 booktitle: "elpub" kind: "inproceedings" key: "Hellman99" - title: "Bibliographic Attribute Extraction from Erroneous References Based on a Statistical Model" author: - name: "Atsuhiro Takasu" link: "https://researchr.org/alias/atsuhiro-takasu" year: "2003" doi: "http://csdl.computer.org/comp/proceedings/jcdl/2003/1939/00/19390049abs.htm" links: doi: "http://csdl.computer.org/comp/proceedings/jcdl/2003/1939/00/19390049abs.htm" tags: - "rule-based" - "bibliography" researchr: "https://researchr.org/publication/Takasu03%3A0" cites: 0 citedby: 0 pages: "49-60" booktitle: "JCDL" kind: "inproceedings" key: "Takasu03:0" - title: "An 
Architecture for Automatic Reference Linking" author: - name: "Donna Bergmark" link: "https://researchr.org/alias/donna-bergmark" - name: "Carl Lagoze" link: "https://researchr.org/alias/carl-lagoze" year: "2001" doi: "http://link.springer.de/link/service/series/0558/bibs/2163/21630115.htm" links: doi: "http://link.springer.de/link/service/series/0558/bibs/2163/21630115.htm" tags: - "architecture" researchr: "https://researchr.org/publication/BergmarkL01" cites: 0 citedby: 0 pages: "115-126" booktitle: "ercimdl" kind: "inproceedings" key: "BergmarkL01" - title: "A Segmentation Method for Bibliographic References by Contextual Tagging of Fields" author: - name: "Dominique Besagni" link: "https://researchr.org/alias/dominique-besagni" - name: "Abdel Belaïd" link: "https://researchr.org/alias/abdel-bela%C3%AFd" - name: "Nelly Benet" link: "https://researchr.org/alias/nelly-benet" year: "2003" doi: "http://csdl.computer.org/comp/proceedings/icdar/2003/1960/01/196010384abs.htm" links: doi: "http://csdl.computer.org/comp/proceedings/icdar/2003/1960/01/196010384abs.htm" tags: - "bibliography" - "tagging" researchr: "https://researchr.org/publication/BesagniBB03" cites: 0 citedby: 0 pages: "384-388" booktitle: "icdar" kind: "inproceedings" key: "BesagniBB03" - title: "Subject and citation indexing, Part I: The clustering structure of composite representations in the Cystic Fibrosis Document Collection" author: - name: "William M. Shaw Jr." link: "https://researchr.org/alias/william-m.-shaw-jr." 
year: "1991" researchr: "https://researchr.org/publication/Shaw91" cites: 0 citedby: 0 journal: "jasis" volume: "42" number: "9" pages: "669-675" kind: "article" key: "Shaw91" - title: "Maintaining an Online Bibliographical Database: The Problem of Data Quality" author: - name: "Michael Ley" link: "http://www.informatik.uni-trier.de/~ley/" - name: "Patrick Reuther" link: "https://researchr.org/alias/patrick-reuther" year: "2006" doi: "http://dblp.uni-trier.de/papers/EGC06_ML_PR.pdf" links: doi: "http://dblp.uni-trier.de/papers/EGC06_ML_PR.pdf" tags: - "bibliography" - "data-flow" - "bibliographic databases" - "database" researchr: "https://researchr.org/publication/LeyR06" cites: 0 citedby: 0 pages: "5-10" booktitle: "f-egc" kind: "inproceedings" key: "LeyR06" - title: "A critical value-added service for e-journals on classics: proposal of a semantic reference linking system between on-line primary and secondary sources" author: - name: "Matteo Romanello" link: "https://researchr.org/alias/matteo-romanello" year: "2008" doi: "http://urn.kb.se/resolve?urn=urn:nbn:se:elpub-401_elpub2008" links: doi: "http://urn.kb.se/resolve?urn=urn:nbn:se:elpub-401_elpub2008" tags: - "source-to-source" - "e-science" - "open-source" researchr: "https://researchr.org/publication/Romanello08" cites: 0 citedby: 0 pages: "401-414" booktitle: "elpub" kind: "inproceedings" key: "Romanello08" - title: "Bibliographic and Web citations: What is the difference?" 
author: - name: "Liwen Vaughan" link: "https://researchr.org/alias/liwen-vaughan" - name: "Debora Shaw" link: "https://researchr.org/alias/debora-shaw" year: "2003" doi: "http://dx.doi.org/10.1002/asi.10338" links: doi: "http://dx.doi.org/10.1002/asi.10338" tags: - "bibliography" researchr: "https://researchr.org/publication/VaughanS03" cites: 0 citedby: 0 journal: "jasis" volume: "54" number: "14" pages: "1313-1322" kind: "article" key: "VaughanS03" - title: "Reference Directed Indexing: Redeeming Relevance for Subject Search in Citation Indexes" author: - name: "Shannon Bradshaw" link: "https://researchr.org/alias/shannon-bradshaw" year: "2003" tags: - "search" researchr: "https://researchr.org/publication/Bradshaw03%3A0" cites: 0 citedby: 0 pages: "499-510" booktitle: "ercimdl" kind: "inproceedings" key: "Bradshaw03:0" - title: "Looking for Entities in Bibliographic Records" author: - name: "Trond Aalberg" link: "https://researchr.org/alias/trond-aalberg" - name: "Maja Zumer" link: "https://researchr.org/alias/maja-zumer" year: "2008" doi: "http://dx.doi.org/10.1007/978-3-540-89533-6_36" links: doi: "http://dx.doi.org/10.1007/978-3-540-89533-6_36" tags: - "bibliography" researchr: "https://researchr.org/publication/AalbergZ08" cites: 0 citedby: 0 pages: "327-330" booktitle: "ICADL" kind: "inproceedings" key: "AalbergZ08" - title: "Integration of simultaneous searching and reference linking across bibliographic resources on the web" author: - name: "William H. Mischo" link: "https://researchr.org/alias/william-h.-mischo" - name: "Thomas G. Habing" link: "https://researchr.org/alias/thomas-g.-habing" - name: "Timothy W. 
Cole" link: "https://researchr.org/alias/timothy-w.-cole" year: "2002" doi: "http://doi.acm.org/10.1145/544220.544244" links: doi: "http://doi.acm.org/10.1145/544220.544244" tags: - "bibliography" researchr: "https://researchr.org/publication/MischoHC02" cites: 0 citedby: 0 pages: "119-125" booktitle: "JCDL" kind: "inproceedings" key: "MischoHC02" - title: "Bibliographical Meta Search Engine for the Retrieval of Scientific Articles" author: - name: "Artur Gajek" link: "https://researchr.org/alias/artur-gajek" - name: "Stefan Klink" link: "https://researchr.org/alias/stefan-klink" - name: "Patrick Reuther" link: "https://researchr.org/alias/patrick-reuther" - name: "Bernd Walter" link: "https://researchr.org/alias/bernd-walter" - name: "Alexander Weber" link: "https://researchr.org/alias/alexander-weber" year: "2007" doi: "http://dx.doi.org/10.1007/978-3-540-74851-9_42" links: doi: "http://dx.doi.org/10.1007/978-3-540-74851-9_42" tags: - "bibliography" - "meta-model" - "Meta-Environment" - "search" - "meta-objects" researchr: "https://researchr.org/publication/GajekKRWW07" cites: 0 citedby: 0 pages: "458-461" booktitle: "ercimdl" kind: "inproceedings" key: "GajekKRWW07" - title: "Logical Structure Recognition of Scientific Bibliographic References" author: - name: "Francois Parmentier" link: "https://researchr.org/alias/francois-parmentier" - name: "Abdel Belaïd" link: "https://researchr.org/alias/abdel-bela%C3%AFd" year: "1997" doi: "http://doi.ieeecomputersociety.org/10.1109/ICDAR.1997.620673" abstract: "In this paper, we are presenting an approach for the logical structure recognition of bibliographic references. The objective is to produce, for each reference (given in a display format, as postscript), structured data containing the hierarchy of fields recognized. As a result of variation among bibliographic references (in order and typographic format of fields, or writing style of the author, for example), we need a robust and tolerant system architecture. 
Thus, recognition is performed by a concept-oriented system that uses a model automatically built from a reference database. This model represents the reference fields and includes statistics on the occurrence of their terms. Recognition is achieved by a step-by-step activation of the more pertinent concepts. Each activated concept causes the execution of an appropriate searching agent. This architecture is robust and non-deterministic, allowing a solution even in difficult cases." links: doi: "http://doi.ieeecomputersociety.org/10.1109/ICDAR.1997.620673" "pdf": "http://francois.parmentier.free.fr/travail/parmenti-icdar97.pdf" tags: - "bibliography" - "meta-model" - "architecture" - "data-flow" - "bibliographic databases" - "writing" - "database" - "Meta-Environment" - "reference parsing" - "parsing" - "systematic-approach" researchr: "https://researchr.org/publication/ParmentierB97" cites: 0 citedby: 0 pages: "1072" booktitle: "icdar" kind: "inproceedings" key: "ParmentierB97" - title: "CiteSeer: An Automatic Citation Indexing System" author: - name: "C. Lee Giles" link: "https://researchr.org/alias/c.-lee-giles" - name: "Kurt D. Bollacker" link: "https://researchr.org/alias/kurt-d.-bollacker" - name: "Steve Lawrence" link: "http://research.google.com/pubs/author103.html" year: "1998" doi: "db/conf/dl/GilesBL98.html" tags: - "automatic citation indexing" - "C++" - "citation indexing" researchr: "https://researchr.org/publication/GilesBL98" cites: 0 citedby: 0 pages: "89-98" booktitle: "DL" kind: "inproceedings" key: "GilesBL98" - title: "Origins of Bibliometrics, Citation Indexing, and Citation Analysis: The Neglected Legal Literature" author: - name: "Fred R. 
Shapiro" link: "https://researchr.org/alias/fred-r.-shapiro" year: "1992" tags: - "analysis" researchr: "https://researchr.org/publication/Shapiro92%3A2" cites: 0 citedby: 0 journal: "jasis" volume: "43" number: "5" pages: "337-339" kind: "article" key: "Shapiro92:2" - title: "Integrating Bibliographical Data from Heterogeneous Digital Libraries" author: - name: "Eike Schallehn" link: "https://researchr.org/alias/eike-schallehn" - name: "Martin Endig" link: "https://researchr.org/alias/martin-endig" - name: "Kai-Uwe Sattler" link: "https://researchr.org/alias/kai-uwe-sattler" year: "2000" abstract: "The integration of bibliographical data today is considered one of the most important tasks in the area of digital libraries. Various available sources of bibliographical information vary widely in terms of data representation and access interfaces. To overcome this heterogeneity during the last years attempts were made to apply methods developed for information system integration, like federated databases and mediators. In this paper we describe our approach using the loosely coupled federated system FRAQL. Furthermore, we present a generic adapter that can be used in highly distributed scenarios which uses XML and related technology for transfer and homogenization of data. As an application scenario we describe global citation linking for integrated digital libraries." 
links: "pdf": "http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.18.5093&rep=rep1&type=pdf" tags: - "bibliography" - "XML" - "XML Schema" - "digital library" - "bibliographical data integration" - "data-flow" - "source-to-source" - "bibliographic databases" - "digital libraries" - "systematic-approach" - "open-source" researchr: "https://researchr.org/publication/SchallehnES00%3A0" cites: 0 citedby: 0 pages: "161-170" booktitle: "adbis" kind: "inproceedings" key: "SchallehnES00:0" - title: "Analysing Social Networks Within Bibliographical Data" author: - name: "Stefan Klink" link: "https://researchr.org/alias/stefan-klink" - name: "Patrick Reuther" link: "https://researchr.org/alias/patrick-reuther" - name: "Alexander Weber" link: "https://researchr.org/alias/alexander-weber" - name: "Bernd Walter" link: "https://researchr.org/alias/bernd-walter" - name: "Michael Ley" link: "http://www.informatik.uni-trier.de/~ley/" year: "2006" doi: "http://dx.doi.org/10.1007/11827405_23" abstract: "Finding relationships between authors and thematic similar publications is getting harder and harder due to the mass of information and the rapid growth of the number of scientific workers. The io-port.net portal and the DBLP Computer Science Bibliography including more than 2,000,000 and 750,000 publications, respectively, from more than 450,000 authors are major services used by thousands of computer scientists which provides fundamental support for scientists searching for publications or other scientists in similar communities. In this paper, we describe a user–friendly interface which plays the central role in searching authors and publications and analysing social networks on the basis of bibliographical data. After introducing the concept of multi-mode social networks, the DBL–Browser itself and various methods for multi-layered browsing through social networks are described. 
" links: doi: "http://dx.doi.org/10.1007/11827405_23" tags: - "DBLP" - "bibliography" - "bibliographical data" - "data-flow" - "e-science" - "social networks" - "social" researchr: "https://researchr.org/publication/KlinkRWWL06" cites: 0 citedby: 0 pages: "234-243" booktitle: "DEXA" kind: "inproceedings" key: "KlinkRWWL06" - title: "Mining Research Communities in Bibliographical Data" author: - name: "Osmar R. Zaïane" link: "https://researchr.org/alias/osmar-r.-za%C3%AFane" - name: "Jiyang Chen" link: "https://researchr.org/alias/jiyang-chen" - name: "Randy Goebel" link: "https://researchr.org/alias/randy-goebel" year: "2007" doi: "http://dx.doi.org/10.1007/978-3-642-00528-2_4" abstract: "Extracting information from very large collections of structured, semi-structured or even unstructured data can be a considerable challenge when much of the hidden information is implicit within relationships among entities in the data. Social networks are such data collections in which relationships play a vital role in the knowledge these networks can convey. A bibliographic database is an essential tool for the research community, yet finding and making use of relationships comprised within such a social network is difficult. In this paper we introduce DBconnect, a prototype that exploits the social network coded within the DBLP database by drawing on a new random walk approach to reveal interesting knowledge about the research community and even recommend collaborations. 
This work is based on an earlier work: DBconnect: mining research community on DBLP data, in Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, COPYRIGHT ACM, 2007, http://portal.acm.org/citation.cfm?doid=1348549.1348558" links: doi: "http://dx.doi.org/10.1007/978-3-642-00528-2_4" tags: - "DBLP" - "rule-based" - "bibliography" - "bibliographical data" - "social web" - "analysis" - "data-flow" - "bibliographic databases" - "database" - "social networks" - "social" - "data-flow analysis" - "systematic-approach" researchr: "https://researchr.org/publication/ZaianeCG07" cites: 0 citedby: 0 pages: "59-76" booktitle: "kdd" kind: "inproceedings" key: "ZaianeCG07" - title: "Identification of Bibliographic Information Written in Both Japanese and English" author: - name: "Yuko Taniguchi" link: "https://researchr.org/alias/yuko-taniguchi" - name: "Hidetsugu Nanba" link: "https://researchr.org/alias/hidetsugu-nanba" year: "2008" doi: "http://dx.doi.org/10.1007/978-3-540-87599-4_55" links: doi: "http://dx.doi.org/10.1007/978-3-540-87599-4_55" tags: - "bibliography" researchr: "https://researchr.org/publication/TaniguchiN08" cites: 0 citedby: 0 pages: "431-433" booktitle: "ercimdl" kind: "inproceedings" key: "TaniguchiN08" - title: "Collaborative bibliography" author: - name: "David G. Hendry" link: "https://researchr.org/alias/david-g.-hendry" - name: "J. R. Jenkins" link: "https://researchr.org/alias/j.-r.-jenkins" - name: "Joseph F. McCarthy" link: "https://researchr.org/alias/joseph-f.-mccarthy" year: "2006" doi: "http://dx.doi.org/10.1016/j.ipm.2005.05.007" abstract: "A bibliography is traditionally characterized by the judgments, bounded by explicit selection criteria, made by a single compiler. Because these criteria concern the attributes ascribed to a work and the needs of readers, bibliographic work is largely conceptual even across technological eras and domains. 
Yet, the development of networked information services, made possible by WWW infrastructure, has enabled very large numbers of people to discover, organize, and publish information, including bibliographies. Indeed, bibliographies, or at least bibliography-like artifacts, are a common genre of website, often published by people without specialized skills in information organization who follow non-rigorous selection procedures. Nevertheless, even if the items from these lists are poorly selected and described, this publishing activity is fundamentally important because it structures information locally, creating a patchy network of secondary access points. In turn, these access points enable information discovery, the formation and development of communities of interest, the estimation of document relevance by search engines, and so on. In sum, this activity, and the enabling technical infrastructure, invites bibliographies to take on new interactive possibilities. The aim of this article is to extend the traditional view of bibliography to encompass collaborative possibilities for wide, or narrow, participation in the shaping of bibliographies and the selection of items. This is done by examining the nature of bibliography on the Web, by proposing a conceptual model that opens bibliography to participatory practices, and by discussing a case study where a team sought to develop a bibliography of electronic resources. This examination reveals splendid opportunities for expanding the notion of bibliography with participatory policies while remaining true to its ancient roots." 
links: doi: "http://dx.doi.org/10.1016/j.ipm.2005.05.007" tags: - "discovery" - "case study" - "bibliography" - "meta-model" - "web service" - "model-driven development" - "source-to-source" - "web services" - "compiler" - "information models" - "Meta-Environment" - "search" - "open-source" researchr: "https://researchr.org/publication/HendryJM06" cites: 0 citedby: 0 journal: "ipm" volume: "42" number: "3" pages: "805-825" kind: "article" key: "HendryJM06" - title: "Digital Libraries and Autonomous Citation Indexing" author: - name: "Steve Lawrence" link: "http://research.google.com/pubs/author103.html" - name: "C. Lee Giles" link: "https://researchr.org/alias/c.-lee-giles" - name: "Kurt D. Bollacker" link: "https://researchr.org/alias/kurt-d.-bollacker" year: "1999" abstract: "The Web is revolutionizing the way researchers access scientific literature, however scientific literature on the Web is largely disorganized. Autonomous citation indexing can help organize the literature by automating the construction of citation indices. Autonomous citation indexing aims to improve the dissemination and retrieval of scientific literature, and provides improvements in cost, availability, comprehensiveness, efficiency, and timeliness." tags: - "digital library" - "C++" - "digital libraries" - "autonomous citation indexing" - "citation indexing" researchr: "https://researchr.org/publication/LawrenceGB%3Acomputer%3A1999" cites: 0 citedby: 0 journal: "Computer" volume: "32" number: "6" pages: "67-71" kind: "article" key: "LawrenceGB:computer:1999" - title: "ParsCit: An open-source CRF reference string parsing package" author: - name: "Councill, I. G." link: "https://researchr.org/alias/councill%2C-i.-g." - name: "Giles, C. L." link: "https://researchr.org/alias/giles%2C-c.-l." - name: "Kan, M. Y." link: "https://researchr.org/alias/kan%2C-m.-y." year: "2008" abstract: "We describe ParsCit, a freely available, open-source implementation of a reference string parsing package. 
At the core of ParsCit is a trained conditional random field (CRF) model used to label the token sequences in the reference string. A heuristic model wraps this core with added functionality to identify reference strings from a plain text file, and to retrieve the citation contexts. The package comes with utilities to run it as a web service or as a standalone utility. We compare ParsCit on three distinct reference string datasets and show that it compares well with other previously published work." links: "pdf": "http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.150.6790&rep=rep1&type=pdf" tags: - "meta-model" - "web service" - "source-to-source" - "C++" - "context-aware" - "Meta-Environment" - "reference parsing" - "parsing" - "open-source" researchr: "https://researchr.org/publication/councill2008parscit" cites: 0 citedby: 0 booktitle: "Proceedings of LREC" kind: "inproceedings" key: "councill2008parscit" - title: "Automatic Extraction of Bibliographic Information from Biomedical Online Journal Articles Using a String Matching Algorithm" author: - name: "Jongwoo Kim" link: "https://researchr.org/alias/jongwoo-kim" - name: "Daniel X. Le" link: "https://researchr.org/alias/daniel-x.-le" - name: "George R. 
Thoma" link: "https://researchr.org/alias/george-r.-thoma" year: "2006" doi: "http://doi.ieeecomputersociety.org/10.1109/CBMS.2006.55" links: doi: "http://doi.ieeecomputersociety.org/10.1109/CBMS.2006.55" tags: - "bibliography" researchr: "https://researchr.org/publication/KimLT06" cites: 0 citedby: 0 pages: "905-912" booktitle: "cbms" kind: "inproceedings" key: "KimLT06" - title: "Efficient Identification of Duplicate Bibliographical References" author: - name: "Vinicius Veloso de Melo" link: "https://researchr.org/alias/vinicius-veloso-de-melo" - name: "Alneu de Andrade Lopes" link: "https://researchr.org/alias/alneu-de-andrade-lopes" year: "2005" tags: - "bibliography" researchr: "https://researchr.org/publication/MeloL05" cites: 0 citedby: 0 pages: "169-176" booktitle: "laptec" kind: "inproceedings" key: "MeloL05" - title: "Locating and Parsing Bibliographical References in HTML Medical Articles" author: - name: "Jie Zou" link: "https://researchr.org/alias/jie-zou" - name: "Daniel X. Le" link: "https://researchr.org/alias/daniel-x.-le" - name: "George R. Thoma" link: "https://researchr.org/alias/george-r.-thoma" year: "2009" abstract: "Bibliographical references that appear in journal articles can provide valuable hints for subsequent information extraction. We describe our statistical machine learning algorithms for locating and parsing such references from HTML medical journal articles. Reference locating identifies the reference sections and then decomposes them into individual references. We formulate reference locating as a two-class classification problem based on text and geometric features. An evaluation conducted on 500 articles from 100 journals achieves near perfect precision and recall rates for locating references. Reference parsing is to identify components, e.g. author, article title, journal title etc., from each individual reference. We implement and compare two reference parsing algorithms. 
One relies on sequence statistics and trains a Conditional Random Field. The other focuses on local feature statistics and trains a Support Vector Machine to classify each individual word, and then a search algorithm systematically corrects low confidence labels if the label sequence violates a set of predefined rules. The overall performance of these two reference parsing algorithms is about the same: above 99% accuracy at the word level, and over 97% accuracy at the chunk level." tags: - "parsing algorithm" - "rule-based" - "classification" - "bibliography" - "machine learning" - "rules" - "search" - "parsing" - "systematic-approach" researchr: "https://researchr.org/publication/Zou2009" cites: 0 citedby: 0 kind: "inproceedings" key: "Zou2009" - title: "On Integrated Bibliography Processing" author: - name: "Michael A. Harrison" link: "https://researchr.org/alias/michael-a.-harrison" - name: "Ethan V. Munson" link: "https://researchr.org/alias/ethan-v.-munson" year: "1988" abstract: "Bibliography processing systems are important to the production of scholarly and technical documents. While the existing systems are a significant aid to authors, their designs are not sufficient to handle the demands that have arisen with their continued use. These demands include larger bibliographic databases, sharing of databases among multiple authors, integration with document editors, and the desire for new features. This paper examines these issues as they are reflected in three enhancements to the bibliography processing facilities of the GNU Emacs BibTEX-Mode and TEX-Mode integrated editing environment. The added features were a reference annotation facility, support of forms-based queries for automatic citation, and an enhanced reference inspection facility supporting WYSIWYG display of references. The design and implementation of the three features are discussed in detail. Their relationship to other bibliography processing tools is discussed." 
tags: - "bibliography" - "bibliographic databases" - "Meta-Environment" - "design" researchr: "https://researchr.org/publication/HarrisonM88" cites: 0 citedby: 0 journal: "epodd" volume: "2" number: "4" pages: "193-209" kind: "article" key: "HarrisonM88" - title: "Subject and citation indexing, Part II: The optimal, cluster-based retrieval performance of composite representations" author: - name: "William M. Shaw Jr." link: "https://researchr.org/alias/william-m.-shaw-jr." year: "1991" tags: - "rule-based" researchr: "https://researchr.org/publication/Shaw91a" cites: 0 citedby: 0 journal: "jasis" volume: "42" number: "9" pages: "676-684" kind: "article" key: "Shaw91a" - title: "Autonomous Citation Matching" author: - name: "Steve Lawrence" link: "http://research.google.com/pubs/author103.html" - name: "C. Lee Giles" link: "https://researchr.org/alias/c.-lee-giles" - name: "Kurt D. Bollacker" link: "https://researchr.org/alias/kurt-d.-bollacker" year: "1999" doi: "http://doi.acm.org/10.1145/301136.301255" abstract: "Advances in computational resources and the communications infrastructure, and the rapid rise of the World Wide Web, have led to the increasingly widespread availability of scientific papers in electronic form. Scientific papers usually contain citations to previous work, and indices of these citations are valuable for literature search, analysis, and evaluation. Current citation indices of the scientific literature are constructed using manual effort and are typically expensive. Part of the reason for using manual effort is the great variability of citation syntax – it can be difficult to autonomously determine if two citations refer to the same article because citations can be written in many different formats. We present machine learning techniques that identify variant forms of citations to the same paper. A number of algorithms are presented. 
An algorithm based on word and phrase matching is found to perform best, and is sufficiently accurate for unassisted use in an autonomous citation indexing system. An algorithm based on a string edit distance performs poorly in comparison. A computationally efficient subfield algorithm is also presented. The accuracy and efficiency of all algorithms is quantitatively compared on a number of datasets." links: doi: "http://doi.acm.org/10.1145/301136.301255" tags: - "rule-based" - "machine learning" - "analysis" - "C++" - "edit distance" - "search" researchr: "https://researchr.org/publication/LawrenceGB99" cites: 0 citedby: 0 pages: "392-393" booktitle: "agents" kind: "inproceedings" key: "LawrenceGB99" - title: "AUTOBIB: Automatic Extraction of Bibliographic Information on the Web" author: - name: "Junfei Geng" link: "https://researchr.org/alias/junfei-geng" - name: "Jun Yang" link: "https://researchr.org/alias/jun-yang" year: "2004" doi: "http://doi.ieeecomputersociety.org/10.1109/IDEAS.2004.14" links: doi: "http://doi.ieeecomputersociety.org/10.1109/IDEAS.2004.14" tags: - "bibliography" researchr: "https://researchr.org/publication/GengY04" cites: 0 citedby: 0 pages: "193-204" booktitle: "ideas" kind: "inproceedings" key: "GengY04" - title: "The ADS bibliographic reference resolver" author: - name: "Accomazzi, A." link: "https://researchr.org/alias/accomazzi%2C-a." - name: "Eichhorn, G." link: "https://researchr.org/alias/eichhorn%2C-g." - name: "Kurtz, M.J." link: "https://researchr.org/alias/kurtz%2C-m.j." - name: "Grant, C.S." link: "https://researchr.org/alias/grant%2C-c.s." - name: "Murray, S.S." link: "https://researchr.org/alias/murray%2C-s.s." 
year: "1999" links: "ads": "http://ads.harvard.edu/pubs/resolver/" "pdf": "http://monet.ncsa.uiuc.edu/adass98/Proceedings/reprints/accomazzia.pdf" tags: - "reference resolver" - "bibliography" - "ADS" researchr: "https://researchr.org/publication/accomazzi1999ads" cites: 0 citedby: 0 booktitle: "ASTRONOMICAL SOCIETY OF THE PACIFIC CONFERENCE SERIES" kind: "inproceedings" key: "accomazzi1999ads" - title: "Consolidation of References to Persons in Bibliographic Databases" author: - name: "Nuno Freire" link: "https://researchr.org/alias/nuno-freire" - name: "José Luis Borbinha" link: "https://researchr.org/alias/jos%C3%A9-luis-borbinha" - name: "Bruno Martins" link: "https://researchr.org/alias/bruno-martins" year: "2008" doi: "http://dx.doi.org/10.1007/978-3-540-89533-6_26" abstract: "Entity resolution is the process of determining if, in a specific context, two or more references correspond to the same entity. In this work, we address this problem in the context of references to persons as they are found in bibliographic data, specifically in the case of consolidating multiple datasets. Our solution follows the extraction, transformation and loading (ETL) process, typical in data warehouses. It computes the similarities of the attribute values for the references, and employs a decision tree to decide when the references match. We describe the characteristics of these references within bibliographic datasets, and how we explored those characteristics by developing new similarity metrics to improve the quality of the consolidation process. We evaluated our work by designing an experiment with data from four national libraries. The results show that the proposed similarity metrics contribute significantly to the consolidation process.
" links: doi: "http://dx.doi.org/10.1007/978-3-540-89533-6_26" tags: - "bibliography" - "data-flow" - "bibliographic databases" - "context-aware" - "reference resolving" - "transformation" researchr: "https://researchr.org/publication/FreireBM08" cites: 0 citedby: 0 pages: "256-265" booktitle: "ICADL" kind: "inproceedings" key: "FreireBM08" - title: "Linking Domain Models and Process Models for Reference Model Configuration" author: - name: "Marcello La Rosa" link: "https://researchr.org/alias/marcello-la-rosa" - name: "Florian Gottschalk" link: "https://researchr.org/alias/florian-gottschalk" - name: "Marlon Dumas" link: "https://researchr.org/alias/marlon-dumas" - name: "Wil M. P. van der Aalst" link: "http://wwwis.win.tue.nl/~wvdaalst/" year: "2007" doi: "http://dx.doi.org/10.1007/978-3-540-78238-4_43" links: doi: "http://dx.doi.org/10.1007/978-3-540-78238-4_43" tags: - "meta-model" - "Meta-Environment" - "process modeling" researchr: "https://researchr.org/publication/RosaGDA07" cites: 0 citedby: 0 pages: "417-430" booktitle: "BPM" kind: "inproceedings" key: "RosaGDA07" - title: "Associative Document Retrieval Techniques Using Bibliographic Information" author: - name: "Gerard Salton" link: "https://researchr.org/alias/gerard-salton" year: "1963" doi: "http://doi.acm.org/10.1145/321186.321188" abstract: "Automatic documentation systems which use the words contained in the individual documents as a principal source of document identifications may not perform satisfactorily under all circumstances. Methods have therefore been devised within the last few years for computing association measures between words and between documents, and for using such associated words, or information contained in associated documents, to supplement and refine the original document identifications. It is suggested in this study that bibliographic citations may provide a simple means for obtaining associated documents to be incorporated in an automatic documentation system. 
The standard associative retrieval techniques are first briefly reviewed. A computer experiment is then described which tends to confirm the hypothesis that documents exhibiting similar citation sets also deal with similar subject matter. Finally, a fully automatic document retrieval system is proposed which uses bibliographic information in addition to other standard criteria for the identification of document content, and for the detection of relevant information." links: doi: "http://doi.acm.org/10.1145/321186.321188" tags: - "bibliography" - "information retrieval" - "source-to-source" - "reviewing" - "open-source" researchr: "https://researchr.org/publication/Salton63" cites: 0 citedby: 0 journal: "JACM" volume: "10" number: "4" pages: "440-457" kind: "article" key: "Salton63" - title: "Manifestation of emerging specialties in journal literature: A growth model of papers, references, exemplars, bibliographic coupling, cocitation, and clustering coefficient distribution" author: - name: "Steven A. Morris" link: "https://researchr.org/alias/steven-a.-morris" year: "2005" doi: "http://dx.doi.org/10.1002/asi.20208" links: doi: "http://dx.doi.org/10.1002/asi.20208" tags: - "bibliography" researchr: "https://researchr.org/publication/Morris05" cites: 0 citedby: 0 journal: "jasis" volume: "56" number: "12" pages: "1250-1273" kind: "article" key: "Morris05" - title: "Naive Bayes Classifier for Extracting Bibliographic Information from Biomedical Online Articles" author: - name: "Jongwoo Kim" link: "https://researchr.org/alias/jongwoo-kim" - name: "Daniel X. Le" link: "https://researchr.org/alias/daniel-x.-le" - name: "George R. 
Thoma" link: "https://researchr.org/alias/george-r.-thoma" year: "2008" tags: - "bibliographic information" - "bibliography" researchr: "https://researchr.org/publication/KimLT08" cites: 0 citedby: 0 pages: "373-378" booktitle: "dmin" kind: "inproceedings" key: "KimLT08" - title: "COBRAS: Cooperative CBR System for Bibliographical Reference Recommendation" author: - name: "Hager Karoui" link: "https://researchr.org/alias/hager-karoui" - name: "Rushed Kanawati" link: "https://researchr.org/alias/rushed-kanawati" - name: "Laure Petrucci" link: "https://researchr.org/alias/laure-petrucci" year: "2006" doi: "http://dx.doi.org/10.1007/11805816_8" abstract: "In this paper, we describe a cooperative P2P bibliographical data management and recommendation system (COBRAS). In COBRAS, each user is assisted by a personal software agent that helps her/him to manage bibliographical data and to recommend new bibliographical references that are known by peer agents. Key problems are: – how to obtain relevant references? – how to choose a set of peer agents that can provide the most relevant recommendations? Two inter-related case-based reasoning (CBR) components are proposed to handle both of the above mentioned problems. The first CBR is used to search, for a given user’s interest, a set of appropriate peers to collaborate with. The second one is used to search for relevant references from the selected agents. Thus, each recommender agent proposes not only relevant references but also some agents which it judges to be similar to the initiator agent. Our experiments show that using a CBR approach for committee and reference recommendation allows to enhance the system overall performances by reducing network load (i.e. number of contacted peers, avoiding redundancy) and enhancing the relevance of computed recommendations by reducing the number of noisy recommendations. 
" links: doi: "http://dx.doi.org/10.1007/11805816_8" tags: - "rule-based" - "p2p" - "software components" - "bibliography" - "redundancy" - "recommender systems" - "software component" - "data-flow" - "source-to-source" - "peer-to-peer" - "search" - "systematic-approach" researchr: "https://researchr.org/publication/KarouiKP06" cites: 0 citedby: 0 pages: "76-90" booktitle: "ewcbr" kind: "inproceedings" key: "KarouiKP06" - title: "Bibliographic Relationships, Citation Relationships, Relevance Relationships, and Bibliographic Classification: An Integrative View" author: - name: "Jonathan Furner" link: "https://researchr.org/alias/jonathan-furner" year: "2002" tags: - "classification" - "bibliography" researchr: "https://researchr.org/publication/Furner02a" cites: 0 citedby: 0 booktitle: "cr" kind: "inproceedings" key: "Furner02a"