PatentsView Data Gaps
This is a further explanation of the 5 major problems listed here.
The root cause is that there are gaps in the bulk data the USPTO makes available.
The PatentsView database is built from nearly 2,000 XML files of granted patent data¹
The problem is that 305 patents are missing from the grant XML files, which means they are also missing from the PatentsView database. Interestingly, 182 of the
missing patents are present in the USPTO's other API, the Patent Examination Data System.
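As a rough illustration of how such a gap can be measured, the sketch below compares three sets of patent numbers: the ones you expect to exist, the ones actually found in the grant XML files, and the ones a second source returns. The function name and the sample numbers are made up for this example. It also assumes you already have a trustworthy list of expected numbers, which in practice has to allow for withdrawn patents.

```python
def find_gaps(expected, in_grant_xml, in_second_source):
    """Return (missing, recoverable).

    missing:     expected numbers that never appear in the grant XML files
    recoverable: the subset of missing numbers the second source still has
    """
    missing = set(expected) - set(in_grant_xml)
    recoverable = missing & set(in_second_source)
    return sorted(missing), sorted(recoverable)


# Made-up numbers, for illustration only.
expected = range(10_000_000, 10_000_010)
in_grant_xml = [n for n in expected if n not in (10_000_003, 10_000_007)]
in_second_source = [10_000_003]

missing, recoverable = find_gaps(expected, in_grant_xml, in_second_source)
print(missing)      # [10000003, 10000007]
print(recoverable)  # [10000003]
```

With real data, the length of the first list would correspond to the 305 missing patents and the length of the second to the 182 that can still be found elsewhere.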
The grant XML files contain classification data, but classifications can change after a patent is issued. As a result, the USPTO makes bulk classification files available
approximately quarterly. The PatentsView team uses the bulk classification files when building its database.
Two main classification systems are in use. One is the relatively new CPC, or Cooperative Patent Classification, system. Problem: plant patents and reissued
patents can be classified using CPCs, yet the bulk CPC file only contains classifications for utility patents. As a result, the PatentsView database only contains
CPC assignments for utility patents.
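One way to check a classification file for this gap is to tally the patent numbers it contains by type. The sketch below relies on the usual numbering convention: plant patents start with PP, reissues start with RE, and utility patents are all digits. The function names and sample numbers are hypothetical, and reading the numbers out of the actual bulk file is left out because its layout is not described here.

```python
from collections import Counter


def patent_type(number: str) -> str:
    """Classify a patent number by its prefix."""
    n = number.strip().upper()
    if n.startswith("PP"):
        return "plant"
    if n.startswith("RE"):
        return "reissue"
    if n.isdigit():
        return "utility"
    return "other"


def count_types(numbers):
    """Count how many patent numbers of each type appear in an iterable."""
    return Counter(patent_type(n) for n in numbers)


# Made-up sample; a real run would pass the patent numbers from the bulk file.
sample = ["10000000", "PP30000", "RE47000", "9876543"]
counts = count_types(sample)
print(counts["utility"], counts["plant"], counts["reissue"])  # 2 1 1
```

If the bulk CPC file really covers only utility patents, the plant and reissue counts for it should both come out as zero.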
¹Patent Grant Full Text Data (No Images) (JAN 1976 - PRESENT) files