knowledge extraction?
Jul. 11th, 2013 08:06 am
So, I've read the book on Tika; it is kind of better than UIMA, but what does it do? It successfully extracts meta info and the contents of HDF and RSS. Excuse me, all this information is already structured for extraction, what's the point? What I am looking for is a way of extracting content from web pages... my case is kind of specific; working on it.
Example:
====================
Deductible
| | | Maximum | Spent | Left |
|---|---|---|---|---|
| In Network | Member: | $3,000.00 | $1,652.40 | $1,347.60 |
| | Family: | $7,000.00 | $1,652.40 | $5,347.60 |
| Out of Network | Member: | $6,000.00 | $0.00 | $6,000.00 |
| | Family: | $12,000.00 | $0.00 | $12,000.00 |
As a result, I get something like this:
Good(( fp(Map(Limit.In Network.Family -> $3,000.00, Remaining Balance.Out of Network.Member -> $3,000.00, Remaining Balance.In Network.Member -> $350.84, Remaining Balance.Out of Network.Family -> $6,000.00, Remaining Balance.In Network.Family -> $1,688.87, Limit.Out of Network.Member -> $3,000.00, Accumulated.Out of Network.Member -> $0.00, Accumulated.In Network.Member -> $1,149.16, Limit.Out of Network.Family -> $6,000.00, Limit.In Network.Member -> $1,500.00, Accumulated.Out of Network.Family -> $0.00, Accumulated.In Network.Family -> $1,311.13)) with prefix <<Deductible>>) ++ (fp(Map(Limit.In Network.Family -> $10,000.00, Remaining Balance.Out of Network.Member -> $9,000.00, Remaining Balance.In Network.Member -> $3,550.84, Remaining Balance.Out of Network.Family -> $18,000.00, Remaining Balance.In Network.Family -> $8,304.22, Limit.Out of Network.Member -> $9,000.00, Accumulated.Out of Network.Member -> $0.00, Accumulated.In Network.Member -> $1,449.16, Limit.Out of Network.Family -> $18,000.00, Limit.In Network.Member -> $5,000.00, Accumulated.Out of Network.Family -> $0.00, Accumulated.In Network.Family -> $1,695.78)) with prefix <<Out of Pocket>>))
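To make the idea concrete, here is a rough sketch in Python of the flattening step only: turning a two-level table like the Deductible one into a flat map with dotted keys. It is not the code that produced the output above. The function name `flatten_table`, the input shape (rows already split into cells) and the key order `Column.Group.Row` are my assumptions for illustration, and the keys use the headers of the example table (Maximum, Spent, Left) rather than the labels in the output.

```python
def flatten_table(columns, rows):
    """Flatten a two-level table into {"Column.Group.Row": value}.

    Hypothetical helper: assumes each row is a list of cell strings, and that
    a row which omits the group cell belongs to the most recent group seen.
    """
    n = len(columns)
    flat = {}
    group = None
    for cells in rows:
        # Drop empty cells left over from the extraction.
        cells = [c.strip() for c in cells if c.strip()]
        if len(cells) == n + 2:
            # Group label + row label + one value per column: new group starts.
            group, cells = cells[0], cells[1:]
        elif len(cells) != n + 1 or group is None:
            raise ValueError(f"unexpected row shape: {cells}")
        label = cells[0].rstrip(":")
        for column, value in zip(columns, cells[1:]):
            flat[f"{column}.{group}.{label}"] = value
    return flat


columns = ["Maximum", "Spent", "Left"]
rows = [
    ["In Network", "Member:", "$3,000.00", "$1,652.40", "$1,347.60"],
    ["Family:", "$7,000.00", "$1,652.40", "$5,347.60"],
    ["Out of Network", "Member:", "$6,000.00", "$0.00", "$6,000.00"],
    ["Family:", "$12,000.00", "$0.00", "$12,000.00"],
]

deductible = flatten_table(columns, rows)
for key in sorted(deductible):
    print(key, "->", deductible[key])
```

The only trick is carrying the group label ("In Network", "Out of Network") forward into rows that do not repeat it. The hard part, getting clean rows of cells out of a real web page in the first place, is deliberately left out.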