704 W Park Ave Suite C
 Edgewater, FL 32132-1409
 Ph  800-832-2823
 Fx   208-631-6381
 Outside the US:
 01-386-426-5393

IP Data Corporation

Manuals, Patents, File Histories, DATA SERVICES

We currently convert patent text data from 15 different formats, delivered by the various patent authorities, into a single, coherent, easy-to-use format that we call MAPS (Modified APS), and now MAPS-XML (MAPS-flavored XML with a complete, well-commented DTD). APS was the original USPTO mainframe data storage format: 80-column and line-oriented. It is still available for free from various sources, in the ASCII character set, for issue weeks from 1976 through 1999. The raw APS data contains over 1600 data errors, but you'll find them. After all, we did (over a period of 5 years, with many errors reported by our clients, for which we are very thankful!).

Even though WIPO produced and maintains an agreed-upon XML standard for patent text data (currently Standard ST.36), the data sets from the authorities that use it differ enough that a single parser would need hundreds of exceptions to read and index the data from multiple authorities, or to convert it to another format. We handle the parsing and conversion of all of the different sets in a modular fashion: multiple front-end modules (reader/parser), a character set conversion pipe (in-convert-out), symbol conversion pipe(s), OCR cleanup and dictionary pipes, and final output modules for the desired destination format. We can also handle multiple inputs in the weekly flow with a single output, with various character conversion or OCR cleanup pipes, as shown in this diagram:


If you have data you need converted to or from various formats, we no doubt already have what you need to handle the job. We can also tailor it so that it is easy to add to your workflow.
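
As a rough illustration of the modular design described above, here is a minimal Python sketch. It is not our production code: the function names (read_records, symbol_pipe, ocr_cleanup_pipe, write_output), the Pipeline class, and the blank-line record separator are all assumptions made up for the example. It shows two inputs in different character sets flowing through shared pipes into a single output.

```python
from dataclasses import dataclass, field
from itertools import chain
from typing import Callable, Iterable, Iterator

# A "pipe" is any function that takes record text and returns transformed text.
Pipe = Callable[[str], str]


def read_records(raw: bytes, source_encoding: str) -> Iterator[str]:
    """Hypothetical front-end module: decode the input and split it into records.

    Assumes, for illustration only, that records are separated by blank lines.
    Decoding to str here plays the role of the "in" side of the character set pipe.
    """
    text = raw.decode(source_encoding)
    for record in text.split("\n\n"):
        if record.strip():
            yield record.strip()


def symbol_pipe(text: str) -> str:
    """Hypothetical symbol conversion pipe: spell out the degree sign."""
    return text.replace("\u00b0", " degrees")


def ocr_cleanup_pipe(text: str) -> str:
    """Hypothetical OCR cleanup pipe: collapse runs of whitespace."""
    return " ".join(text.split())


@dataclass
class Pipeline:
    """Applies each pipe, in order, to every record."""
    pipes: list[Pipe] = field(default_factory=list)

    def run(self, records: Iterable[str]) -> Iterator[str]:
        for record in records:
            for pipe in self.pipes:
                record = pipe(record)
            yield record


def write_output(records: Iterable[str], dest_encoding: str) -> bytes:
    """Hypothetical output module: one record per line, in the destination encoding."""
    return "\n".join(records).encode(dest_encoding)


# Two inputs in different character sets, one output.
input_a = "Heated  to 40\u00b0 C\n\nSecond   record".encode("latin-1")
input_b = "Third record".encode("utf-8")
records = chain(read_records(input_a, "latin-1"), read_records(input_b, "utf-8"))
output = write_output(Pipeline([symbol_pipe, ocr_cleanup_pipe]).run(records), "utf-8")
```

Because each stage only consumes and produces text records, a new source format would need only a new reader in a design like this, and the pipes and output modules could be reused unchanged.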

Contact us and let us know the following particulars:

  • Source Format (specification), 
  • Source character set and language,
  • Number of Source publications,
  • How they are grouped when stored (single files or multiple pubs per physical file),
  • Total storage size on disk of Source data,
  • Destination Format (specification),
  • Destination character set desired,
  • Any additional translations required such as:
  •    HTML Entities to UTF-8 characters or HTML format (ex: X&sup2; to X<sup>2</sup> )
  •    HTML Entities to Plain Text (words for scientific symbols or characters)
  •    UTF or ISO characters to Text or HTML Entities,
  • Additional character data translations and insertions for indexing, such as scientific symbols followed parenthetically by their plain-text names, for example:  Å (Angstrom)   Ø (Phase),
  • Any reports required such as lists of all scientific symbols or conversions, and
  • Anything else you can think of that you may require.
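
To make the translation options in the list above concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the function names and the two lookup tables are hypothetical and deliberately tiny, while html.unescape is the standard-library call that resolves HTML entities to Unicode characters.

```python
import html
import re

# Illustrative lookup table, using the two symbol names from the list above.
SYMBOL_NAMES = {
    "\u00c5": "Angstrom",  # Å
    "\u00d8": "Phase",     # Ø
}

# Illustrative mapping from superscript entities to HTML markup.
SUP_ENTITIES = {
    "&sup1;": "<sup>1</sup>",
    "&sup2;": "<sup>2</sup>",
    "&sup3;": "<sup>3</sup>",
}

# Matches any single symbol that has an entry in SYMBOL_NAMES.
SYMBOL_PATTERN = re.compile("|".join(map(re.escape, SYMBOL_NAMES)))


def entities_to_utf8(text: str) -> str:
    """Replace HTML entities with the Unicode characters they name."""
    return html.unescape(text)


def entities_to_html_sup(text: str) -> str:
    """Replace superscript entities with explicit <sup> markup."""
    for entity, markup in SUP_ENTITIES.items():
        text = text.replace(entity, markup)
    return text


def annotate_symbols(text: str) -> str:
    """Follow each known symbol with its plain-text name in parentheses."""
    return SYMBOL_PATTERN.sub(
        lambda m: f"{m.group(0)} ({SYMBOL_NAMES[m.group(0)]})", text
    )


as_utf8 = entities_to_utf8("X&sup2;")       # superscript two as one character
as_html = entities_to_html_sup("X&sup2;")   # X<sup>2</sup>
annotated = annotate_symbols("10 \u00c5")   # symbol followed by "(Angstrom)"
```

A real job would need far larger tables, and which name to attach to a symbol depends on the subject matter, which is why we ask for these particulars up front.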

The bottom line is that we can probably save you money and time. Give us a call at one of the above numbers, or send an email with a brief description to IPDataCorp.com with the user name Support, and put Data Conversion Info Request in the subject line. If you provide enough detail about the data, we will let you know what we can do for you, provide a ballpark cost, and possibly an estimated completion time. Then you can decide if you'd like a formal quote.

                                                             * * * * *


ITEMS OF INTEREST


Cooperative Patent Classification System

UPDATE


In September 2014, the USPTO and KIPO (Korean IP Office) inked an agreement to work together on the CPC. KIPO will fully CPC-classify all new patent applications and utility models, but there was no mention of back-file data. The CPC system is gaining wide acceptance. It definitely has the detail needed to replace the US system, and then some. We're still studying and learning the details, and the USPTO decision for the two-year overlap (CPC/USC) makes more sense to us (it gives us more time, too: 1 year down, 1 to go). The EPO seems to have completed most, if not all, of their ECLA-to-CPC translations and CPC assignments and, unlike the ECLA data, the EPO's CPC data have been added to DOCDB for all countries they handle. We think it's a wonderful addition to the data, and our compliments and thanks go to the EPO!