2010-05-19

Source: Semantic Web

XBRL Across the Language Divide

XBRL is good. It could be better, at least when it comes to using the business reporting standard to present and retrieve financial report data across a range of languages.

That may be one of the areas that benefits from the help of Europe’s MONNET Project (Multilingual Ontologies for Networked Knowledge), a research effort for integrating information access across different languages. The Project can have a big impact on businesses operating in a global, networked, multinational world. Whether an organization is contemplating doing business with a company in another country and needs access to that company’s information in a foreign language, or is modeling web services across language barriers, a multilingual Semantic Web can make the work easier.

Today some companies, including SAP, offer language-specific tools for text analytics, but the MONNET Project will try to build software that breaks the link between conceptual information and linguistic expressions (the labels that point back to concepts in ontologies) for each language. That should make it easier and quicker to perform such analytics across multiple languages. For each language, you will have a dictionary that points to the concepts. Analysis of information in documents can then be more independent of the language in which they were written, and it becomes easier to add new languages to lexicons and automatically generate extraction rules to draw out facts from text.
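As a rough illustration of that idea, the sketch below keeps one small lexicon per language, each mapping labels to shared concept identifiers, and generates a simple extraction rule from whichever lexicon is selected. The concept identifiers, the labels, and the helper names `build_rule` and `extract_concepts` are assumptions made for this example; they do not come from the MONNET Project or from any real XBRL taxonomy.

```python
import re

# Hypothetical lexicons: each maps surface labels in one language to a
# language-independent concept identifier (all invented for illustration).
LEXICONS = {
    "en": {"revenue": "Revenue", "net income": "NetIncome"},
    "de": {"umsatzerlöse": "Revenue", "jahresüberschuss": "NetIncome"},
}


def build_rule(lexicon):
    """Generate one case-insensitive pattern from a lexicon's labels."""
    # Try longer labels first so multi-word labels are not cut short.
    labels = sorted(lexicon, key=len, reverse=True)
    pattern = r"\b(" + "|".join(re.escape(label) for label in labels) + r")\b"
    return re.compile(pattern, re.IGNORECASE)


def extract_concepts(text, lang):
    """Return the concept identifiers mentioned in text, in order of appearance."""
    lexicon = LEXICONS[lang]
    rule = build_rule(lexicon)
    return [lexicon[match.group(1).lower()] for match in rule.finditer(text)]


english = extract_concepts("Revenue rose while net income fell.", "en")
german = extract_concepts("Die Umsatzerlöse stiegen, der Jahresüberschuss sank.", "de")
print(english, german)
```

Both sentences resolve to the same concept identifiers, so anything built on top of those identifiers does not need to know which language the text was written in. Supporting another language in this sketch means adding one more lexicon entry, not writing new rules by hand.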

Consider again the need for cross-lingual business intelligence on companies and services, says Susan Thomas, a product manager in SAP’s research department in Germany, which is responsible for the use case analysis for the MONNET Project. There is a project underway in Europe around XBRL (eXtensible Business Reporting Language) to drive adoption of the business reporting standard, and across the world more countries are demanding its use for filing financial information. “XBRL itself is meant to be language-independent, and for each language you want to use it in, you can specify the different labels for each concept,” Thomas says. But, “you still might have a language problem. When someone files a financial report (SAP, for instance, files it in English in the U.S. and in German in Germany, and maybe a company that is only based in France files only in French), you still have the problem of multilingual access to the data.”

A semantics-based solution for accessing such information across language barriers will make it easier for users to search and query financial information in their own language, extract data (including information in free text fields), and create reports that present the search and query results in a chosen language, as well.
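A minimal sketch of what such access could look like, assuming facts are stored against language-independent concept identifiers with per-language labels attached: the user searches with a term in one language and gets the results labeled in another. The concept identifiers, labels, company names, and figures are all made up for illustration, and `query` is a hypothetical helper, not part of XBRL or of any MONNET software.

```python
from dataclasses import dataclass

# Hypothetical per-concept labels in several languages (invented for
# illustration; not taken from a real XBRL taxonomy).
LABELS = {
    "Revenue": {"en": "Revenue", "de": "Umsatzerlöse", "fr": "Chiffre d'affaires"},
    "NetIncome": {"en": "Net income", "de": "Jahresüberschuss", "fr": "Résultat net"},
}


@dataclass(frozen=True)
class Fact:
    company: str
    concept: str  # language-independent concept identifier
    value: float
    unit: str


def build_lexicon(lang):
    """Map lowercase labels in one language back to concept identifiers."""
    return {
        labels[lang].lower(): concept
        for concept, labels in LABELS.items()
        if lang in labels
    }


def query(facts, term, query_lang, display_lang):
    """Find facts by a term in query_lang; label the results in display_lang."""
    concept = build_lexicon(query_lang).get(term.lower())
    if concept is None:
        return []
    label = LABELS[concept][display_lang]
    return [(f.company, label, f.value, f.unit) for f in facts if f.concept == concept]


# Made-up filings from companies reporting in different languages.
FACTS = [
    Fact("ExampleCo FR", "Revenue", 120.0, "EUR"),
    Fact("ExampleCo DE", "Revenue", 340.0, "EUR"),
    Fact("ExampleCo DE", "NetIncome", 25.0, "EUR"),
]

rows = query(FACTS, "Umsatzerlöse", query_lang="de", display_lang="en")
for row in rows:
    print(row)
```

The query term is German, the stored facts carry no language at all, and the report comes back with English labels. This sketch does not touch the harder part the article mentions, extracting data from free text fields.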

Insights So Far

Underlying all the use cases -- including the use case for web services -- is the same technology, Thomas says. “In MONNET we will just use simple and partial descriptions of business services for the underlying conceptual model, and the idea is you can browse and search for them in your language. Even though they were described in a different language you will still be able to find them.” Requirements gathering for the use cases began in March and is due to wrap up within six months, after which prototypes will be built and revisions made based upon requests.

But the requirements gathering stage is already bearing some interesting fruit. It’s become apparent, for example, that companies have spent a lot of effort building terminology databases. How to leverage those in the project will be worth exploring, and it’s an area where academic and business points of view may diverge somewhat. Academic partners from DERI, DFKI, and the Polytechnical University of Madrid (hosts of the recent Multilingual Semantic Web Conference) are building the software, and naturally, with their semantic web background, will be thinking strongly about ontologies.

“The semantic web is all about ontologies,” says Thomas. “Most companies have invested a lot of effort in creating terminology databases that aren’t exactly ontologies, though. It’s a legacy work, you might call it.” And the whole past history of what they’ve done can’t be put aside, nor does she think it has to be. “I think they could be leveraged for doing this sort of text analytics. I think that is one thing to look into, to build upon existing things that companies have, and not just assume this is a green field because in reality it is a brown field.”

It’s also likely that the translation aspect of the project will be semi-automatic, not fully automatic. “The idea is that you have a conceptual model and then, for example, a lexicon in English and now you want to create a lexicon in German. Ideally you could do that automatically but really you need someone with expertise to look at it, and then the issue is what kind of skills do people need to do that translation?” she says. The tools can make suggestions for translations and rank them so that the most likely correct translation appears at the top of a list, with human oversight on whether it’s correct.
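The semi-automatic workflow described above might be sketched as follows: candidate translations for a concept's label are ranked by score, and a reviewer accepts or rejects them from the top down. The candidate labels, the scores, and the `approve` callback standing in for the human expert are all assumptions for this example; the article does not say how MONNET scores or presents its suggestions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    target_label: str
    score: float  # hypothetical confidence between 0 and 1


def rank_suggestions(candidates):
    """Order candidates so the most likely translation comes first."""
    return sorted(candidates, key=lambda s: s.score, reverse=True)


def review(concept, ranked, approve):
    """Return the first suggestion the reviewer approves, or None."""
    for suggestion in ranked:
        if approve(concept, suggestion.target_label):
            return suggestion.target_label
    return None


# Invented German candidates for the English label of a "Revenue" concept.
candidates = [
    Suggestion("Einkommen", 0.35),
    Suggestion("Umsatzerlöse", 0.80),
    Suggestion("Ertrag", 0.55),
]
ranked = rank_suggestions(candidates)

# The lambda stands in for a human expert's decision.
approved = review("Revenue", ranked, lambda concept, label: label == "Umsatzerlöse")
print([s.target_label for s in ranked], approved)
```

Keeping the approval step as a separate callback reflects the point made in the text: the tool proposes and orders, while a person with the right expertise decides.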

There seems to be plenty of time to work out some of the issues, as the project doesn’t conclude until 2013. At the end of the effort, SAP will also assess whether it wants to build some of the ideas and technology into its products. Says Thomas, “That’s an ongoing activity as part of research, assessing how it should impact our future products.”

The MONNET project has received EUR 2.4 million under the 'Information and communication technologies' Theme of the Seventh Framework Programme (FP7).