Competitive analysis using bulk US patent data

In 2010, the United States Patent and Trademark Office (USPTO) agreed to let Google Books host bulk downloads of patents and trademarks for the public, at no cost to the end user. This gives users access to all USPTO publications, including patent and trademark applications. Through this single point of access, anyone can view published patent applications from 2001 to the present and get a sense of what competitors are preparing for the future.

The bulk downloads are available as zip files; the full-text files are around 85MB and upwards. The files nested in the zip files are in XML format. Naturally, the files do not include unpublished patent applications, but they do deliver everything that has been published, as well as all granted patents and trademarks. The patent applications are available as full text, full text with embedded images, and multiple-page images.
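As a rough illustration of working with zip files that contain XML, the sketch below reads each .xml member of an archive and pulls out title text using only the Python standard library. It rests on assumptions that are not stated above: that a single member may hold several XML documents back to back, each starting with its own `<?xml ` declaration, and that titles sit in an element named `invention-title`. The function names are hypothetical, and the tag name should be checked against the actual files.

```python
import zipfile
import xml.etree.ElementTree as ET

XML_DECLARATION = b"<?xml "


def iter_xml_documents(zip_source):
    """Yield each XML document (as bytes) found in the archive's .xml members.

    Assumption: a member may contain several documents concatenated together,
    each beginning with its own XML declaration.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".xml"):
                continue
            parts = archive.read(name).split(XML_DECLARATION)
            # Anything before the first declaration is a document without one.
            if parts[0].strip():
                yield parts[0]
            for chunk in parts[1:]:
                if chunk.strip():
                    yield XML_DECLARATION + chunk


def extract_titles(zip_source, title_tag="invention-title"):
    """Return the text of every `title_tag` element (the tag name is an assumption)."""
    titles = []
    for document in iter_xml_documents(zip_source):
        try:
            root = ET.fromstring(document)
        except ET.ParseError:
            continue  # skip documents the standard parser cannot handle
        for element in root.iter(title_tag):
            titles.append("".join(element.itertext()).strip())
    return titles
```

Calling `extract_titles("some-weekly-file.zip")` (a placeholder file name) would return a list of title strings. Documents that fail to parse are skipped silently here; a real pipeline would probably want to log them instead.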

The patent application files are kept up to date, with new submissions posted every week, and they go back to 2001. The trademark applications include daily front files and annual back files that date back to 1884. The information provided for trademarks includes goods and services as well as serial numbers and filing dates.

The bulk downloads are also searchable. However, the results returned may not be complete; for complete information, you should still go directly to the USPTO Patent Application Information Retrieval (PAIR) site (you will have to answer a CAPTCHA to gain access).

Both the patent and trademark bulk downloads provide a wealth of information that, with the right data mining approach, can be very useful for competitive intelligence.
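One very simple data mining approach is to count how many applications each applicant files per year. The sketch below assumes the records have already been extracted from the bulk files into plain dictionaries; the `applicant` and `filing_date` keys and the `YYYY-MM-DD` date format are hypothetical, chosen only for illustration.

```python
from collections import Counter


def filings_per_applicant(records):
    """Count filings per (applicant, year) pair.

    Assumption: each record is a dict with hypothetical keys 'applicant'
    and 'filing_date', the latter starting with a four-digit year.
    """
    counts = Counter()
    for record in records:
        applicant = record.get("applicant", "").strip().lower()
        year = record.get("filing_date", "")[:4]
        if applicant and year.isdigit():
            counts[(applicant, year)] += 1
    return counts
```

A sudden rise in one competitor's yearly count may hint at a new area of investment, although raw counts say nothing about the quality or scope of the filings.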