2018

Getting started with ORCID using the rorcid package. A practical introduction to searching for researcher identifiers with the ORCID API and retrieving publication data with rorcid, purrr and rcrossref.

When working with data tables from the scientific or patent literature, columns often contain concatenated data. Here’s a tidy approach to dealing with concatenation and an exploration of why trimming white space really matters.

Quick API resources to search for scientific publications in R or Python.

Creating bibliographies is a pain but the knitr and rcrossref packages in RStudio will ease your suffering.

Exploring geocoding 5000 organisation names from Web of Science in R with the Google Maps API using the placement, ggmap and googleway packages.

For accurate patent statistics you need to be able to identify the earliest priority filings. This article shows you how with R as part of work for the WIPO Patent Analytics Handbook.

If you are looking for how to import Excel into R… read this update. It’s now much easier to import and the post also covers the dreaded https problem and using writexl.

A quick and mildly evil guide to the @ROpensci robotstxt package.

2016

In this article we will use RStudio to prepare patent data for visualisation in an infographic using the online software tool infogram. Infographics are a popular way of presenting data in a way that is easy for a reader to understand without reading a long report. Infographics are well suited to …

2015

In this chapter we look at the use of the rplos package from rOpenSci to access the scientific literature from the Public Library of Science using the PLOS Search API. The Public Library of Science (PLOS) is the main champion of open access peer reviewed scientific publications and has published …

One problem for people seeking to learn patent analytics is a lack of access to patent data from different sources. In this article I introduce the patent datasets developed for the WIPO Open Source Patent Analytics Project as training sets for patent analytics. The datasets will be used in the …

In this article we provide a quick introduction to the online graphing service Plotly to create graphics for use in patent analysis. Plotly is an online graphing service that allows you to import Excel, text and other files for visualisation. It also has API services for R, Python, MATLAB and a …

This article provides an overview of the open source and free software tools that are available for patent analytics. The aim of the chapter is to serve as a quick reference guide for some of the main tools in the tool kit. This article is now a chapter in the WIPO Manual on Open Source Patent …

In this article we provide a brief introduction to The Lens patent database as a free source of data for patent analytics. The Lens is a patent database based in Australia that describes itself as “an open global cyberinfrastructure to make the innovation system more efficient and fair, more …

This is Part 2 of an article introducing R for patent analytics that focuses on visualising patent data in R using the ggplot2 package. In Part 1 we introduced the basics of wrangling patent data in R using the dplyr package to select and add data. In this article we will go into more detail on …

Patentscope is the WIPO public access database. It includes coverage of the Patent Cooperation Treaty applications (administered by WIPO) and a wide range of other countries including the European Patent Office, USPTO and Japan, totalling 51 million patent documents including 2.8 million PCT …

In this article we will be analysing and visualising patent data using Tableau Public. Tableau Public is a free version of Tableau Desktop and provides a very good practical introduction to the use of patent data for analysis and visualisation. In many cases Tableau Public will represent the …

This article provides a quick overview of some of the main sources of free patent data. It is intended for quick reference and points to some free tools for accessing patent databases that you may not be familiar with. This article is now a chapter in the WIPO Manual on Open Source Patent Analytics. …

This article focuses on visualising patent data in networks using the open source software Gephi. Gephi is one of a growing number of free network analysis and visualisation tools, with others including Cytoscape, Tulip, GraphViz, Pajek for Windows, and VOSviewer to name but a few. In addition, …

This article provides a walk through of patent data fields for those who are completely new to patent analytics or want to understand the workings of patent data a little bit better. A video version of the walk through is available here and the slide deck is available for download in .pdf, …

Cleaning patent data is one of the most challenging and time-consuming tasks involved in patent analysis. In this chapter we will cover: […] Open Refine is an open source tool for working with all types of messy data. It started life as Google Refine but has since migrated to Open Refine. It …

This post was updated in 2018 and you can read it here. The CRAN Project has the following to say about importing Excel files into R: “The first piece of advice is to avoid doing so if possible! If you have access to Excel, export the data you want from Excel in tab-delimited or comma-separated form, …

This post is now showing its age and was the first thing I wrote about R. Everything still works but readr, which was brand new at the time, has made a big difference. I now suggest importing local csv files into RStudio using File > Import > From Text (readr) as it is by far the easiest …