This is a further explanation of the 5 major problems listed here.
The PatentsView database is built from nearly 2,000 XML files of granted patent data.
Problem 1: nearly 8,000 patents found in the XML grant files were subsequently withdrawn,
yet they are loaded into the PatentsView database along with the other 6 million or so
patents found in the grant XML files. I don't know of another system that returns data for withdrawn patents.
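One way to keep withdrawn patents out of a load can be sketched as a simple filter. This is a minimal sketch, not the PatentsView team's actual process: it assumes a list of withdrawn patent numbers is available from some source, and the function name `drop_withdrawn` and the sample numbers are made up for illustration.

```python
def drop_withdrawn(grant_numbers, withdrawn_numbers):
    """Return the grant numbers that are not in the withdrawn list, keeping order."""
    withdrawn = set(withdrawn_numbers)  # set gives fast membership tests
    return [n for n in grant_numbers if n not in withdrawn]


# Made-up placeholder patent numbers, not real data.
parsed_from_xml = ["1000001", "1000002", "1000003"]
withdrawn_list = ["1000002"]

kept = drop_withdrawn(parsed_from_xml, withdrawn_list)
print(kept)
```

Filtering at load time, rather than deleting afterwards, would keep withdrawn patents from ever reaching the downstream tables.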
Problem 5: there are 305 non-withdrawn patents missing from the grant XML files, which means they are also missing from the PatentsView database. Interestingly, 182 of the
missing patents are present in the USPTO's other API, the Patent Examination Data System.
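The comparison behind counts like these can be expressed with set operations. The following is only a sketch under assumptions: it presumes a reference set of issued, non-withdrawn patent numbers exists, and the function name `find_missing` and all sample values are hypothetical.

```python
def find_missing(expected, in_grant_xml, in_other_source):
    """Return (missing, recoverable) sets of patent numbers.

    missing:     expected numbers that never appear in the grant XML files
    recoverable: the subset of missing numbers found in another source
    """
    missing = set(expected) - set(in_grant_xml)
    recoverable = missing & set(in_other_source)
    return missing, recoverable


# Made-up placeholder numbers, not real patents.
expected = {"2000001", "2000002", "2000003", "2000004"}
in_grant_xml = {"2000001", "2000004"}
in_other_source = {"2000002", "2000004"}

missing, recoverable = find_missing(expected, in_grant_xml, in_other_source)
print(sorted(missing), sorted(recoverable))
```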
There is classification data in the grant XML files, but classifications can change after a patent is issued. As a result, the USPTO makes bulk classification files available
approximately quarterly. The PatentsView team uses the bulk classification files when building its database.
There are two main classification systems in use. One is the relatively new CPC, or Cooperative Patent Classification, system; the other is the older USPC, or
United States Patent Classification, system. Problem 2: plant patents and reissued patents can be classified using CPCs, yet the bulk CPC file only contains
classifications for utility patents. As a result, the PatentsView database only contains CPC assignments for utility patents. There is also a problem when the grant XML
files being loaded are more recent than the period covered by the bulk classification files. Example: a recent update loaded patent data through 2018-11-27, but the most
recent bulk USPC file only has data through 2018-04-24. As a result, the last seven months of reissued and plant patents
in the PatentsView database do not have USPC classifications (Problem 4).
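The coverage gap in this example can be checked with plain date arithmetic. This is a minimal sketch using the two dates quoted above; the function name `lacks_bulk_coverage` and the sample issue dates are assumptions for illustration only.

```python
from datetime import date


def lacks_bulk_coverage(issue_date, bulk_file_through):
    """True if a patent was issued after the last date the bulk file covers."""
    return issue_date > bulk_file_through


loaded_through = date(2018, 11, 27)  # grant data loaded through this date
uspc_through = date(2018, 4, 24)     # last date covered by the bulk USPC file

# Length of the uncovered window, roughly seven months.
gap_days = (loaded_through - uspc_through).days

# Hypothetical issue dates on either side of the cutoff.
print(lacks_bulk_coverage(date(2018, 5, 1), uspc_through))
print(lacks_bulk_coverage(date(2018, 4, 24), uspc_through))
```

A check like this, run before a load, would flag that the classification file is older than the grant data it is meant to cover.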
The most recent bulk CPC file has data through 2018-11-27, but for some reason CPCs are not coming back on the most recently issued patents
in the PatentsView database.
Perhaps the load was mistakenly done without first retrieving the most recent bulk CPC file. Or perhaps the grant XML files were processed before 2019-02-05, which is when the latest bulk
CPC file was made available.