#! /usr/bin/env python
# wiposearchretrieve.py
#
# Author: Pedro Hernandez. May, 2008. v1.0
#
# Retrieves search results from the WIPO (World Intellectual Property
# Organization) website and saves them as a delimiter-separated file.
#
# It takes one command-line parameter: the search terms. If they contain
# spaces, they must be enclosed in quotation marks, e.g. "Bla Bla".
#
# A sample of what we get if we execute
# c> wiposearchretrieve.py "Smartcard"
#Query Params|Record Id.|Patent Code|Publication Date|Description|International Class|Application Number|Applicant Name|url|Abstract
#Smartcard|1|WO 2008/051999|02.05.2008|CONTACTLESS |G07F 7/00|PCT/US2007/082274|MEI, INC.|http://www.wipo.int/pctdb/en/fetch.jsp?LANG=ENG&DBSELECT=PCT&SERVER_TYPE=19-10&SORT=41228299-KEY&TYPE_FIELD=256&IDB=0&IDOC=1451224&C=10&ELEMENT_SET=BASICHTML-ENG&RESULT=1&TOTAL=1847&START=1&DISP=500&FORM=SEP-0/HITNUM,B-ENG,DP,MC,AN,PA,ABSUM-ENG&SEARCH_IA=US2007082274&QUERY=%22Smartcard%22|A multi media payment device includes a banknote acceptor and a RF card reader, and also may include a magnetic card reader. A bezel assembly for connection to the bill acceptor preferably includes a reader unit to read magnetic swipe cards and contactless chip cards.
#Smartcard|2|WO 2008/051982|02.05.2008|CONTENT OWNER VERIFICATION AND DIGITAL RIGHTS MANAGEMENT FOR AUTOMATED DISTRIBUTION AND BILLING PLATFORMS|H04M 3/42|PCT/US2007/082250|SMS.AC|http://www.wipo.int/pctdb/en/fetch.jsp?LANG=ENG&DBSELECT=PCT&SERVER_TYPE=19-10&SORT=41228299-KEY&TYPE_FIELD=256&IDB=0&IDOC=1451207&C=10&ELEMENT_SET=BASICHTML-ENG&RESULT=2&TOTAL=1847&START=1&DISP=500&FORM=SEP-0/HITNUM,B-ENG,DP,MC,AN,PA,ABSUM-ENG&SEARCH_IA=US2007082250&QUERY=%22Smartcard%22|Software application providers can connect to a common platform in order to offer access to and use of their applications and/or content to a global community of mobile device users through a variety of different media. The users are automatically charged via the user's billing account with the wireless network carrier to which the user subscribes. The platform can also use billing mechanisms to bill the user other than the user's wireless network carrier, such as credit cards, bank accounts, prepaid cards, web-based payment services, etc. The application provider need not have contractual agreements with any of the wireless network carriers, as billing is automatically performed by the platform through the wireless network carriers his or ...
#Smartcard|3|WO 2008/051694|02.05.2008|SYSTEM AND METHOD FOR DEVELOPING AND MANAGING GROUP SOCIAL NETWORKS|G06F 3/00|PCT/US2007/080527|INSTABUDDY LLC|http://www.wipo.int/pctdb/en/fetch.jsp?LANG=ENG&DBSELECT=PCT&SERVER_TYPE=19-10&SORT=41228299-KEY&TYPE_FIELD=256&IDB=0&IDOC=1450550&C=10&ELEMENT_SET=BASICHTML-ENG&RESULT=3&TOTAL=1847&START=1&DISP=500&FORM=SEP-0/HITNUM,B-ENG,DP,MC,AN,PA,ABSUM-ENG&SEARCH_IA=US2007080527&QUERY=%22Smartcard%22|A system and method for facilitating the configuration and management of events within a social networking system is disclosed. The system enables members of similar or different geographic region and/or like interests, hobbies, social status, relationship status, family status, etc. to interact with the system to view activities, register to participate in activities, and schedule activities. A personal workspace, accessible through a variety of devices (e.g., kiosks, web clients, wireless devices, and set-top boxes) enables network members to view a personal calendar, scheduled events and activities, invitations, localized news, and the like. The personal workspace further facilitates registration to participate in scheduled activities. A...
#Smartcard|4|WO 2008/051335|02.05.2008|TRANSACTION PROCESSING METHOD|G06Q 10/00|PCT/US2007/019821|WELLS, R., Scott|http://www.wipo.int/pctdb/en/fetch.jsp?LANG=ENG&DBSELECT=PCT&SERVER_TYPE=19-10&SORT=41228299-KEY&TYPE_FIELD=256&IDB=0&IDOC=1448744&C=10&ELEMENT_SET=BASICHTML-ENG&RESULT=4&TOTAL=1847&START=1&DISP=500&FORM=SEP-0/HITNUM,B-ENG,DP,MC,AN,PA,ABSUM-ENG&SEARCH_IA=US2007019821&QUERY=%22Smartcard%22|The transaction processing method is a computer-implemented method capable of logging events related to a consumer at a point of transaction 100. The event logging is performed at a transaction processing center 120. The transaction processing center 120 can log such events as: receipts generated by a plurality of merchants doing business with the consumer; cash transactions generated at a plurality of cash transaction venues visited by the consumer; credit transactions generated by a plurality of creditors of the consumer; and non-financial events associated with the consumer. The events are reported to the transaction processing center 120 by a plurality of associate members who contract with the center 120 to provide the data. For each c...
#...
# Enjoy it!
import httplib, mimetypes, sys, os
def post(host, selector, body):
    """
    Minimal HTTP POST: sends the form-encoded body to selector on host
    and returns the body of the response.
    """
    h = httplib.HTTP(host)
    h.putrequest('POST', selector)
    h.putheader('content-type', 'application/x-www-form-urlencoded')
    h.putheader('content-length', str(len(body)))
    h.endheaders()
    h.send(body)
    errcode, errmsg, headers = h.getreply()
    return h.file.read()
# This is the delimiter that will be used in the final file. You can use
# a semicolon ";" or any other character instead.
delimiter = "|"
# The query is the value searched for in the database. If no parameter is
# supplied, report it and exit.
if sys.argv[1:] == []:
    print "No data to query the database"
    sys.exit(1)
query = sys.argv[1]
# Replace the spaces with + signs and wrap the query in URL-encoded
# quotation marks (%22), as the WIPO search engine requires
query = query.replace(" ", "+")
queryorig = query
query = "%22" + query + "%22"
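# Illustrative sketch, not part of the original script: the encoding above
# only handles spaces. Assuming the search engine accepts standard form
# encoding, quote_plus would produce the same result for simple terms and
# would also escape characters such as "&" or "=". The helper name
# encode_query is hypothetical.
try:
    from urllib import quote_plus        # Python 2
except ImportError:
    from urllib.parse import quote_plus  # Python 3

def encode_query(terms):
    """Return the search terms wrapped in quotation marks and form-encoded."""
    return quote_plus('"' + terms + '"')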
# Output filename for the intermediate HTML result (a byproduct)
filenameout = "ResultWipo" + query + ".html"
# This parameter should not be changed. It is the maximum number of records
# retrieved per HTTP POST request
displaycount = "500"
# These are the parameters that must be passed to the server for the query
body = "LANGUAGE=ENG&SERVER_TYPE=19&DBSELECT2=SPECIFY&DBSELECT=PCT&TYPE_FIELD=256&C=10&RANKTYPE=KEY&QUERY=" + query + "&ELEMENT_SET=BASICHTML-ENG&BRIEF_ELEMENT_SET=HITNUM%2CB-ENG%2CDP%2CMC%2CAN%2CPA%2CABSUM-ENG&SEPDISPLAY=FALSE&DISPLAYCOUNT=" + displaycount
# Open the output file for the html result
fout=open(filenameout, 'w')
# We do the first query
queryresult = post("www.wipo.int","/pctdb/cgi/guest/search5",body)
# And write it to the intermediate file
fout.write(queryresult)
rec = 1
# Now we get the number of records that meet the search criteria
maxrec_f = queryresult.find("records\n")
maxrec_i = queryresult.find(": ",maxrec_f - 20, maxrec_f) + len(": ")
maxrec = int(queryresult[maxrec_i:maxrec_f])
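# Illustrative sketch, not part of the original script: the find()-based
# slicing above raises an error if the page layout changes. Assuming the
# count appears as ": <number> records" in the response, a regular
# expression can extract it and signal a miss with None. The helper name
# parse_record_count and the sample text in the usage line are hypothetical.
import re

def parse_record_count(html):
    """Return the record count found in html, or None if it is absent."""
    match = re.search(r":\s*(\d+)\s*records", html)
    if match is None:
        return None
    return int(match.group(1))

# Usage: parse_record_count("Results: 1847 records\n") returns 1847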
print str(maxrec) + " records found, retrieved in batches of " + displaycount
# This is the position of the next record to retrieve (the first batch
# has already been fetched)
rec = rec + int(displaycount)
# Now, while the number of records retrieved is less than
# the number of records meeting the criteria...
while rec < maxrec:
    # Prepare the next query
    startrec = rec
    endrec = rec + int(displaycount)
    # Clamp the last batch to the total number of records
    if endrec > maxrec:
        endrec = maxrec
    body = "LANGUAGE=ENG&SERVER_TYPE=19&DBSELECT=PCT&TYPE_FIELD=256&C=10&RANKTYPE=KEY&QUERY=" + query + "&ORIG_QUERY=" + query + "&START=" + str(startrec) + "&END=" + str(endrec) + "&ELEMENT_SET=BASICHTML-ENG&BRIEF_ELEMENT_SET=HITNUM%2CB-ENG%2CDP%2CMC%2CAN%2CPA%2CABSUM-ENG&SEPDISPLAY=FALSE&DISPLAYCOUNT=" + displaycount
    queryresult = post("www.wipo.int","/pctdb/cgi/guest/irange5",body)
    # Write the next batch to the intermediate html file
    fout.write(queryresult)
    rec = endrec
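# Illustrative sketch, not part of the original script: the START/END
# arithmetic of the loop above, isolated so it can be checked without
# contacting the server. The helper name batch_ranges is hypothetical;
# step stands in for int(displaycount).
def batch_ranges(maxrec, step=500):
    """Yield the (start, end) pairs requested after the first batch."""
    rec = 1 + step
    while rec < maxrec:
        end = min(rec + step, maxrec)
        yield rec, end
        rec = end

# Usage: list(batch_ranges(1847)) gives
# [(501, 1001), (1001, 1501), (1501, 1847)]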
print "retrieving done. Formatting files"
# Close the intermediate file
fout.close()
# Now we reopen the html file in read mode
fin = open(filenameout, 'r')
# And the final file in write mode
fcsv = open(filenameout+".txt", 'w')
contents = fin.read()
# First we set a limit for the record
recorddelimiter = '