Correspondence Pattern Attribute Selection for Consumption of Federated Data Sources

O'SULLIVAN, DECLAN; BRENNAN, ROB; WALSHE, BRIAN

dc.contributor.author	O'SULLIVAN, DECLAN	en
dc.contributor.author	BRENNAN, ROB	en
dc.contributor.author	WALSHE, BRIAN	en
dc.date.accessioned	2013-08-12T14:00:01Z
dc.date.available	2013-08-12T14:00:01Z
dc.date.created	April 16-20, 2012	en
dc.date.issued	2012	en
dc.date.submitted	2012	en
dc.identifier.citation	Brian Walshe, Rob Brennan, Declan O'Sullivan, Correspondence Pattern Attribute Selection for Consumption of Federated Data Sources, Distributed Autonomous Network Management Systems/Network Operation and Management Symposium, Maui, Hawaii, USA, April 16-20, 2012, pp. 1234-1240	en
dc.identifier.other	Y	en
dc.identifier.uri	http://hdl.handle.net/2262/66995
dc.description	PUBLISHED	en
dc.description	Maui, Hawaii, USA	en
dc.description.abstract	When consuming data from federated domains, it is often necessary to identify the relationships that exist between the data schemas used in each domain. Discovering the exact nature of these relationships is difficult due to data set schema heterogeneity. Prior work has focused on inter-domain class equivalence. However, it is not always possible to find an equivalent class in both schemas, for example when instances are modeled as classes in one domain (e.g. router type) but as the attribute values of a single class in the other domain (e.g. router interface). This paper investigates whether, when classifying instances in one data set against a second schema, it may be more useful to perform the classification using some attribute (or attribute group) other than the original class type. A machine-learning-based classification approach to appropriate attribute selection is presented, and its operation is evaluated using two large data sets available on the web as Linked Data. The classification problem is compounded by the less formal semantics of Linked Data when compared to full ontologies, but this also highlights the strength of our approach in dealing with noisy or under-specified data sets and schemas. The experimental results show that our attribute selection approach is capable of discovering appropriate mappings for cases where the correspondence is conditioned on one attribute, and that information gain provides a suitable scoring function for selecting correspondence patterns to describe these complex attribute-based mappings.	en
dc.description.sponsorship	Science Foundation Ireland FAME Strategic Research Cluster (award No. 08/SRC/I1408)	en
dc.format.extent	1234	en
dc.format.extent	1240	en
dc.language.iso	en	en
dc.rights	Y	en
dc.subject.other	Data schemas
dc.title	Correspondence Pattern Attribute Selection for Consumption of Federated Data Sources	en
dc.title.alternative	Distributed Autonomous Network Management Systems/Network Operation and Management Symposium	en
dc.type	Conference Paper	en
dc.type.supercollection	scholarly_publications	en
dc.type.supercollection	refereed_publications	en
dc.identifier.peoplefinderurl	http://people.tcd.ie/walshebr	en
dc.identifier.peoplefinderurl	http://people.tcd.ie/osulldps	en
dc.identifier.peoplefinderurl	http://people.tcd.ie/rbrenna	en
dc.identifier.rssinternalid	82487	en
dc.contributor.sponsor	Science Foundation Ireland (SFI)	en
dc.contributor.sponsorGrantNumber	08/SRC/I1408	en

Files in this item

Name: AttributeSelection.pdf
Size: 1.076Mb
Format: PDF
Description: Accepted for publication (author's ...

Name: license.txt
Size: 3.243Kb
Format: Text file


This item appears in the following Collection(s)

Computer Science (Scholarly Publications)