Please KIS Me

"Half a hectare of land and one year of labour were required to feed one person in 1900 whereas that same half-hectare now feeds 10 persons on the basis of just one and a half days of labour. The difference lies in the scientific knowledge[...]" UNESCO Science Report 2005

Tuesday, May 13, 2008

More on Thesis Sources

Patents, patents, patents!

I have been working a bit more on the patents side of my thesis. After the previous script, which retrieved the results of a patent search from the WIPO website, I have developed another script that imports this data into a database. I used a PostgreSQL database, but since I was not able to connect to it directly from Python, I used an ODBC connection instead. I guess that, apart from the connection parameters, the script should work with any ODBC-ready database.

It creates two tables: one with all the patent information, and another that relates each patent to the search terms that returned it. Here are the definitions of both tables:

-- Table: patents

-- DROP TABLE patents;

CREATE TABLE patents
(
p_id integer NOT NULL,
p_code character(20),
pub_dated date,
description character(512),
applicant character(1000),
intl_class character(15),
appl_number character(20),
url character(500),
abstract character(3000),
CONSTRAINT firstkey PRIMARY KEY (p_id)
)
WITH (OIDS=FALSE);
ALTER TABLE patents OWNER TO postgres;

-- Index: codeidx

-- DROP INDEX codeidx;

CREATE UNIQUE INDEX codeidx
ON patents
USING btree
(p_code);



-- Table: patent_query

-- DROP TABLE patent_query;

CREATE TABLE patent_query
(
recid integer NOT NULL,
pat_id integer,
query_terms character(256) NOT NULL,
CONSTRAINT patent_query_pkey PRIMARY KEY (recid),
CONSTRAINT patent_query_pat_id_fkey FOREIGN KEY (pat_id)
REFERENCES patents (p_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (OIDS=FALSE);
ALTER TABLE patent_query OWNER TO postgres;

-- Index: queryidx

-- DROP INDEX queryidx;

CREATE UNIQUE INDEX queryidx
ON patent_query
USING btree
(pat_id, query_terms);



I have uploaded the second script to the site; it can be found here, along with the previous one.
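For readers who want to try the import step, here is a minimal sketch of how it could look. It is not the uploaded script: the names `load_patents`, `next_id` and `FIELDS` are illustrative assumptions, and the column order is taken from the header line of the retrieval script's output. It expects the two tables defined above and a DB-API connection that uses `?` placeholders, as pyodbc does.

```python
import csv
from datetime import datetime

# Column order assumed from the header line of the retrieval script's output.
FIELDS = ["query_terms", "rec_id", "p_code", "pub_date", "description",
          "intl_class", "appl_number", "applicant", "url", "abstract"]


def next_id(cur, table, column):
    """Return max(column) + 1, since the tables define no sequence."""
    cur.execute("SELECT COALESCE(MAX(%s), 0) + 1 FROM %s" % (column, table))
    return cur.fetchone()[0]


def load_patents(conn, lines, delimiter="|"):
    """Load delimiter-separated records into patents and patent_query.

    conn is any DB-API connection using '?' placeholders (assumption);
    lines is an iterable of text lines, header first.
    Returns the number of newly inserted patents.
    """
    cur = conn.cursor()
    reader = csv.DictReader(lines, fieldnames=FIELDS, delimiter=delimiter,
                            quoting=csv.QUOTE_NONE)
    next(reader)  # skip the header line
    inserted = 0
    for row in reader:
        if row["abstract"] is None:
            continue  # incomplete line, e.g. a trailing "..."
        code = row["p_code"].strip()
        cur.execute("SELECT p_id FROM patents WHERE p_code = ?", (code,))
        found = cur.fetchone()
        if found:
            p_id = found[0]  # patent already stored by an earlier search
        else:
            p_id = next_id(cur, "patents", "p_id")
            pub_date = datetime.strptime(row["pub_date"], "%d.%m.%Y").date()
            cur.execute(
                "INSERT INTO patents (p_id, p_code, pub_dated, description, "
                "applicant, intl_class, appl_number, url, abstract) "
                "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
                (p_id, code, pub_date, row["description"], row["applicant"],
                 row["intl_class"], row["appl_number"], row["url"],
                 row["abstract"]))
            inserted += 1
        # Link the patent to the search terms unless the pair already exists.
        cur.execute("SELECT 1 FROM patent_query "
                    "WHERE pat_id = ? AND query_terms = ?",
                    (p_id, row["query_terms"]))
        if cur.fetchone() is None:
            cur.execute(
                "INSERT INTO patent_query (recid, pat_id, query_terms) "
                "VALUES (?, ?, ?)",
                (next_id(cur, "patent_query", "recid"), p_id,
                 row["query_terms"]))
    conn.commit()
    return inserted
```

With pyodbc installed, a call such as `load_patents(pyodbc.connect(connection_string), open("results.txt"))` should do the job, where `connection_string` and `results.txt` stand for your own ODBC settings and output file. Looking up `p_code` before inserting keeps the unique indexes happy when the same patent comes back from several searches.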


Sunday, May 04, 2008

WIPO Python document retriever

While working on the sources for my MSc thesis, I started to look at the patent search tool of WIPO (World Intellectual Property Organization). As it is the worldwide database and the topic I am covering is quite globalised, I think it is the best IPR database for the analysis. I need to analyse several searches on the WIPO database, and I have not yet settled on the terms of the final search. So I have to run preliminary searches, analyse the number of records per year, per applicant... and then decide whether the terms are OK.

The problem is that the way the website returns the records is not convenient for this kind of analysis. So I have written a Python script that retrieves the records matching the search terms and stores them in a delimiter-separated file. This file can be imported into a spreadsheet and analysed with the usual tools, or even loaded into a database for more advanced processing.

I have put it online under a Creative Commons license, as usual. I will update the uploaded code if I develop it further.
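To give an idea of the storing step, here is a small sketch of how records could be written to such a file. It is only an illustration, not the script itself: `clean` and `write_records` are made-up names, each record is assumed to be a sequence of eight values in the order of the header, and the retrieval from the website is left out entirely.

```python
HEADER = ["Query Params", "Record Id.", "Patent Code", "Publication Date",
          "Description", "International Class", "Application Number",
          "Applicant Name", "url", "Abstract"]


def clean(value, delimiter="|"):
    """Make a field safe for a one-line record: replace the delimiter
    with a space and collapse whitespace, including line breaks."""
    return " ".join(str(value).replace(delimiter, " ").split())


def write_records(out, query, records, delimiter="|"):
    """Write a header line plus one numbered line per record
    to a file-like object."""
    out.write(delimiter.join(HEADER) + "\n")
    for number, record in enumerate(records, start=1):
        fields = [query, number] + list(record)
        out.write(delimiter.join(clean(f, delimiter) for f in fields) + "\n")
```

Cleaning every field first matters because abstracts can contain line breaks or the delimiter itself, and either would break the one-record-per-line layout.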

Here is a sample of what you would get if you execute:

C> wiposearchretrieve.py "Smartcard"

Query Params|Record Id.|Patent Code|Publication Date|Description|International Class|Application Number|Applicant Name|url|Abstract
SIM+Card|1|WO 2008/052205|02.05.2008|COMBINED ALGORITHMIC AND EDITORIAL-REVIEWED MOBILE CONTENT SEARCH RESULTS|G06Q 10/00|PCT/US2007/082754|JUMP TAP, INC.|http://www.wipo.int/pctdb/en/fetch.jsp?LANG=ENG&DBSELECT=PCT&SERVER_TYPE=19-10&SORT=41236014-KEY&TYPE_FIELD=256&IDB=0&IDOC=1451431&C=10&ELEMENT_SET=BASICHTML-ENG&RESULT=1&TOTAL=3280&START=1&DISP=500&FORM=SEP-0/HITNUM,B-ENG,DP,MC,AN,PA,ABSUM-ENG&SEARCH_IA=US2007082754&QUERY=%22SIM+Card%22|In embodiments of the present invention improved capabilities are described for reviewing mobile content to determine relevance such that presenting the reviewed content to a mobile communication facility may be based at least in part on the relevance. The reviewed content may be subject to an algorithmic review, an editorial review, or a combined algorithmic- editorial review. The reviewed content may be blacklisted and prevented from being presented, whitelisted and permitted to be presented to the mobile communication facility, or given a relevance score based at least in part on appropriateness of content. Portions of the reviewed content may be removed or replaced with appropriate content prior to presentation to the user.
SIM+Card|2|WO 2008/052100|02.05.2008|PORTABLE MULTIFUNCTION DEVICE, METHOD, AND GRAPHICAL USER INTERFACE FOR ADJUSTING AN INSERTION POINT MARKER|G06F 3/048|PCT/US2007/082486|APPLE INC.|http://www.wipo.int/pctdb/en/fetch.jsp?LANG=ENG&DBSELECT=PCT&SERVER_TYPE=19-10&SORT=41236014-KEY&TYPE_FIELD=256&IDB=0&IDOC=1451325&C=10&ELEMENT_SET=BASICHTML-ENG&RESULT=2&TOTAL=3280&START=1&DISP=500&FORM=SEP-0/HITNUM,B-ENG,DP,MC,AN,PA,ABSUM-ENG&SEARCH_IA=US2007082486&QUERY=%22SIM+Card%22|In accordance with some embodiments, a computer-implemented method is performed at a portable electronic device with a touch screen display. The method includes: displaying graphics and an insertion marker at a first location in the graphics on the touch screen display; detecting a finger contact with the touch screen display; and in response to the detected finger contact, expanding the insertion marker from a first size to a second size on the touch screen display and expanding a portion of the graphics on the touch screen display from an original size to an expanded size. The method further includes detecting movement of the finger contact on the touch screen display and moving the expanded insertion marker in accordance with the detecte...
SIM+Card|3|WO 2008/051718|02.05.2008|METHOD, SYSTEM, AND GRAPHICAL USER INTERFACE FOR MAKING CONFERENCE CALLS|H04M 1/247|PCT/US2007/080971|APPLE INC.|http://www.wipo.int/pctdb/en/fetch.jsp?LANG=ENG&DBSELECT=PCT&SERVER_TYPE=19-10&SORT=41236014-KEY&TYPE_FIELD=256&IDB=0&IDOC=1450639&C=10&ELEMENT_SET=BASICHTML-ENG&RESULT=3&TOTAL=3280&START=1&DISP=500&FORM=SEP-0/HITNUM,B-ENG,DP,MC,AN,PA,ABSUM-ENG&SEARCH_IA=US2007080971&QUERY=%22SIM+Card%22|A user interface for handling multiple calls includes displaying an image associated with a first party on a first call and an image associated with a second party on a second call. When one call is active and the other call is on hold, the image associated with the party that is on the active call is visually highlighted to make it more visually prominent relative to the other image. When both calls are joined into a conference call, both images are displayed adjacent to each other and neither is visually highlighted relative to the other.
...
...
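Once the file exists, the counts per year and per applicant mentioned earlier do not even need a spreadsheet. The following is a rough sketch using only the standard library; `count_records` is a hypothetical helper, and it assumes the header names and the dd.mm.yyyy date format shown in the sample.

```python
import csv
from collections import Counter


def count_records(lines, delimiter="|"):
    """Count records per publication year and per applicant.

    lines is an iterable of text lines whose first line is the header.
    Returns two Counter objects: (per_year, per_applicant).
    """
    reader = csv.DictReader(lines, delimiter=delimiter,
                            quoting=csv.QUOTE_NONE)
    per_year, per_applicant = Counter(), Counter()
    for row in reader:
        pub_date = row.get("Publication Date")
        applicant = row.get("Applicant Name")
        if not pub_date or not applicant:
            continue  # skip incomplete lines such as "..."
        per_year[pub_date.split(".")[-1]] += 1  # year is the last part
        per_applicant[applicant.strip()] += 1
    return per_year, per_applicant
```

Calling `per_applicant.most_common(10)` on the result then lists the ten applicants with the most records for a given search.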


Friday, May 02, 2008

Sources of information

It is interesting to see how hard it is to get the right sources for a rigorous piece of work on any subject. Let's say that sources are half of the final result. Of course, they are not half of the effort, just as buying good bricks is not half the effort of building a house, but a bad choice of bricks would have dramatic consequences for the final result.

For my work on knowledge production around SIM technology, I have found some databases that should help in retrieving records related to the research. I found most of them through the UPC library page. I have also checked the UB and UOC libraries, as these are the only libraries I have remote access to, and almost all of the databases are accessible from them.

Additionally, I have started using del.icio.us, the fantastic online bookmark manager. I am not a fan of bookmarks, so the basic features provided by Firefox used to be enough for me, but now that I have to look for interesting sources on the Internet and be able to find them again, it is a very powerful tool.

One of the problems of creating a knowledge database is the format. I started with MS Word, then switched to BibTeX, and finally settled on Excel. I will combine Excel with Python scripts to be able to convert between this format and the others... let's see how it works ;-)
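As a first idea of what such a conversion could look like, here is a minimal sketch that turns a sheet saved as CSV into BibTeX entries. Everything in it is an assumption for illustration: the function names, and a sheet layout with a `key` column, an optional `type` column, and one column per BibTeX field (author, title, year...).

```python
import csv


def row_to_bibtex(row):
    """Format one spreadsheet row (a dict) as a BibTeX entry.

    The 'key' and 'type' columns are hypothetical; every other
    non-empty column becomes a BibTeX field of the same name.
    """
    entry_type = row.get("type") or "misc"
    fields = [(name, value.strip()) for name, value in row.items()
              if name and name not in ("key", "type") and value]
    body = ",\n".join("  %s = {%s}" % (name, value) for name, value in fields)
    return "@%s{%s,\n%s\n}" % (entry_type, row["key"], body)


def csv_to_bibtex(lines):
    """Convert CSV lines (header first) into a BibTeX string."""
    return "\n\n".join(row_to_bibtex(row) for row in csv.DictReader(lines))
```

Going through CSV avoids depending on the Excel file format itself, at the price of one manual "Save as" step.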
