New computer techniques to analyze historic Hebrew, Arabic documents under development
August 14, 2009
Researchers at Ben-Gurion University of the Negev (BGU) will combine the scientific and scholarly expertise of their humanities and computer science experts in a new project to analyze degraded Hebrew documents.
The effort to develop new computer algorithms combines BGU's scientific expertise in computer vision, computer graphics, image processing and computational geometry with the scholarly expertise of historians and liturgy scholars. The aim is to answer open questions about Jewish liturgical texts and Arabic historical texts and so advance scholarship in these fields.
The technical goal of the research is to develop new state-of-the-art algorithms for analyzing text and combine them into an easy-to-operate, open-source system of tools to aid historical document research throughout the world.
Experiments are being conducted on degraded documents from sources such as the Cairo Geniza, copies of which are located at the national liturgy project at BGU, the El-Aqsa manuscript library in Jerusalem and the Al-Azhar manuscript library in Cairo. Most fragments that have been discovered at the Geniza are now in libraries at Cambridge and Oxford universities, the Jewish Theological Seminary in New York, the British Library and in Israel and Paris.
Until now the documents have not been researched systematically. Prof. Uri Ehrlich of the Goldstein-Goren Department of Jewish Thought is the head of the Prayer Research Project at BGU. He explains that, "There was one book that was originally used as a Hebrew prayer book from the 12th century, but had been scratched off, and the parchment used to write an Arabic text (called a palimpsest). Our aim was to read the first book and not the second book. So we needed to find out how the Arab book could disappear and would leave only the Hebrew letters of the original book. This is why the computer sciences and humanities departments at BGU decided to collaborate."
"To solve the problem, we created an algorithm to cover the text in a dark grey color, which then highlights lighter colored pixels as background space and identifies the darker pixels as outlining the original Hebrew lettering," said Prof. Klara Kedem of the Department of Computer Sciences and one of the system's creators.
Many of the new methods will apply to other languages as well, including binarization of highly degraded documents (converting images with up to 256 grey levels to black and white to facilitate digitization), segmentation of skewed and curved lines, and word spotting in both curved and highly degraded documents. Other algorithms will be more language specific, such as paleographic analysis of Hebrew and Arabic historical documents, which will include automatic indexing of document collections and determining the authorship, location and date of the documents.
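To make the idea of binarization concrete, here is a minimal sketch in Python. It is not the BGU team's algorithm, which the article does not describe in detail. It assumes an 8-bit greyscale image stored as a list of rows and uses Otsu's method, a standard textbook technique, to choose the cut-off between ink and background. The function names `otsu_threshold` and `binarize` are illustrative only.

```python
from typing import List


def otsu_threshold(pixels: List[int]) -> int:
    """Return the grey level (0-255) that best separates dark from light pixels.

    Otsu's method picks the threshold that maximizes the variance
    between the two resulting classes. Assumption: this stands in for
    whatever thresholding the real system uses.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels at or below the candidate threshold
    sum_dark = 0  # sum of the grey levels of those pixels
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue  # no pixels in the dark class yet
        w_light = total - w_dark
        if w_light == 0:
            break  # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        between_var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(image: List[List[int]]) -> List[List[int]]:
    """Map each pixel to 0 (ink) or 255 (background) using Otsu's threshold."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[0 if p <= t else 255 for p in row] for row in image]


if __name__ == "__main__":
    # Tiny made-up example: two dark "ink" pixels on a light page.
    page = [[200, 210, 40],
            [205, 35, 220]]
    for row in binarize(page):
        print(row)
```

A single global threshold like this tends to fail on unevenly faded or stained pages, which is one reason highly degraded documents call for more specialized methods such as those the project aims to develop.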
The research is being funded by the Israel Science Foundation (ISF). Prof. Ehrlich and other BGU scholars in the humanities will be among those to evaluate the system to be built by Prof. Klara Kedem and Dr. Jihad El-Sana of the Department of Computer Sciences and Prof. Emeritus Tsiki Dinstein from Electrical Engineering.
The group is part of the emerging global effort to understand, manipulate and archive historical documents so that they are available to researchers in paleography, archaeology and historical research.
Source: American Associates, Ben-Gurion University of the Negev