Package edu.upf.taln.dri.lib.model.graph
Class DependencyGraph
- java.lang.Object
-
- edu.upf.taln.dri.lib.model.graph.DependencyGraph
-
public class DependencyGraph extends Object
Dependency Graph data structure and methods
-
-
Constructor Summary
Constructor Description
DependencyGraph()
-
Method Summary
Modifier and Type Method
Description
Integer addEdge(String name, Integer from, Integer to, boolean isSRL)
Add an edge.
Boolean addHeadWord(Integer nodeId, String headWord, boolean deletePreexisting)
Add a head word to the set of head words associated with the node.
Integer addNode(Integer nodeId, String word, String pos, String lemma, Set<Integer> corefIDs, String corefName, Integer sentOrder)
Add a new node.
Boolean addRhetoricalClass(Integer nodeId, String rhetoricalClass, boolean deletePreexisting)
Add a rhetorical class to the set of rhetorical classes associated with the node.
boolean addToMergedIDmap(Integer nodeId, Integer mergedNodeId, String mergedNodeName, boolean emptyMap)
Add an element to the node_mergedIDmap of the node.
boolean addToMergedNameMap(Integer nodeId, Integer mergedNodeId, String mergedNodeName, boolean emptyMap)
Add an element to the node_mergedNameMap of the node.
boolean changeEdgeName(Integer edgeId, String newEdgeName)
Change the name of the edge. The new name must not be null or empty.
boolean changeNodeName(Integer nodeId, String newNodeName)
Change the name of the node. The new name must not be null or empty.
Integer compactNodes()
Tentative version of node-collapsing heuristics over a dependency graph.
int deleteEdgesByNameRegExp(List<String> regexpList, boolean deleteMatching)
Delete all the edges whose name matches one of the regular expressions.
Map<Integer,org.apache.commons.lang3.tuple.Pair<Integer,Integer>> getAllEdges()
Get all edges of the dependency graph.
List<Integer> getAllOrderedDepthFirstEdges()
Get all edges of the dependency graph, ordered by a depth-first visit.
Map<Integer,Integer> getCausalCauseIDmap(Integer nodeId)
Get the map of causal relations the node is part of.
Map<Integer,Integer> getCausalEffectIDmap(Integer nodeId)
Get the map of causal relations the node is part of.
Map<Integer,String> getCausalRoleNameMap(Integer nodeId)
Get the map of causal relations the node is part of.
Set<Integer> getChildrenNodes(Integer nodeId)
The id(s) of the child node(s).
Integer getEdgeFromNode(Integer edgeId)
Get the start node of the edge.
String getEdgeName(Integer edgeId)
Get the edge name.
Set<Integer> getEdgesByNameRegExp(String nameRegExp)
Get the ids of the edges whose name matches the regular expression.
Set<Integer> getEdgesByNameSourceAndDestination(Integer sourceId, Integer destinationId, String name)
Get edges by source node id, destination node id, and name. At least one of source node id, destination node id, and name must not be blank or empty.
String getEdgeSRLsensefeat(Integer edgeId)
Get the SRL sense of the edge.
Integer getEdgeSRLsentId(Integer edgeId)
Get the SRL sentence ID of the edge.
String getEdgeSRLtag(Integer edgeId)
Get the SRL tag of the edge.
Integer getEdgeToNode(Integer edgeId)
Get the end node of the edge.
Set<String> getHeadWordsSet(Integer nodeId)
Get the set of head words associated with the node.
Map<Integer,org.apache.commons.lang3.tuple.Pair<Integer,Integer>> getIncidentEdges(Integer nodeId, String name)
Get the set of edges entering the node.
Map<Integer,String> getMergedIDmap(Integer nodeId)
Get the map of original node ID and lexicalization of the nodes merged into the current node.
Map<Integer,String> getMergedNameMap(Integer nodeId)
Get the map of sentence order ID and lexicalization of the sentence nodes merged into the current node.
Set<Integer> getNodeCorefID(Integer nodeId)
The IDs of the coreference chains in which the node is the head of one of their elements.
String getNodeCorefName(Integer nodeId)
Get the co-reference node name (aggregates all the tokens of the coreference chain element).
int getNodeCount()
Number of nodes in the graph.
String getNodeLemma(Integer nodeId)
Get the node lemma.
String getNodeName(Integer nodeId)
Get the node name.
String getNodePOS(Integer nodeId)
Get the node POS.
Set<Integer> getNodesByNameRegExp(String nameRegExp)
Get the ids of the nodes whose name matches the regular expression.
Integer getNodeSentOrder(Integer nodeId)
Get the node sentence order.
Map<Integer,org.apache.commons.lang3.tuple.Pair<Integer,Integer>> getOutgoingEdges(Integer nodeId, String name)
Get the set of edges going out from the node.
Set<Integer> getParentNodeIds(Integer nodeId)
The id(s) of the parent node(s).
Set<String> getRhetoricalClassSet(Integer nodeId)
Get the set of rhetorical classes associated with the node.
Set<Integer> getRootNodeIds()
Get the IDs of the root nodes.
org.apache.commons.lang3.tuple.Pair<Integer,List<String>> getSentenceIDTokensPair(Integer nodeId)
Get the pair of sentence ID and list of tokens of a specific node (list of sentence words separated by space).
Map<Integer,Set<Integer>> getSRLframeParticipantTags(Integer rootNodeId)
Get a map of predicates the node specified by ID is root of.
Map<Integer,Set<String>> getSRLframeRoots()
Get a map from node ID (key) to a set of senses, considering only nodes that are roots of SRL frames. The set of senses is the set of different frames the node participates in as root.
String graphAsString(GraphToStringENUM outputType)
Get a string representation of the graph.
static void main(String[] args)
boolean mergeNodes(Integer nodeId1, Integer nodeId2, String newNodeName)
Merge the second node with the first.
void sanitizeGraph()
Check for and remove self-loop edges, and merge duplicated / non-SRL edges.
Boolean setEdgeSRLsenseAndTag(Integer edgeId, String tag, String sense, Integer sentenceId)
Set the sense, tag, and source sentence ID of an SRL edge.
Boolean setNodeInCrossSentCausalRel(Integer nodeId, Integer causalRelId, String roleName, Integer causeNode, Integer effectNode)
Set a node as participating in a cross-sentence causal relation (the cause is annotated in one sentence and the effect in another). A node annotated with these properties can be the cause or the effect of the causal relation.
Boolean setSentenceIDTokensPair(Integer nodeId, Integer sentenceID, List<String> sentenceTokenList)
Set the pair of sentence ID and list of tokens for a specific node (list of sentence words separated by space).
-
-
-
Method Detail
-
addNode
public Integer addNode(Integer nodeId, String word, String pos, String lemma, Set<Integer> corefIDs, String corefName, Integer sentOrder)
Add a new node.
Parameters:
nodeId
word
pos
lemma
-
changeNodeName
public boolean changeNodeName(Integer nodeId, String newNodeName)
Change the name of the node. The new name must not be null or empty.
Parameters:
nodeId
newNodeName
-
setEdgeSRLsenseAndTag
public Boolean setEdgeSRLsenseAndTag(Integer edgeId, String tag, String sense, Integer sentenceId)
Set the sense, tag, and source sentence ID of an SRL edge.
Parameters:
edgeId
tag
sense
sentenceId
-
setNodeInCrossSentCausalRel
public Boolean setNodeInCrossSentCausalRel(Integer nodeId, Integer causalRelId, String roleName, Integer causeNode, Integer effectNode)
Set a node as participating in a cross-sentence causal relation (the cause is annotated in one sentence and the effect in another). A node annotated with these properties can be the cause or the effect of the causal relation.
Parameters:
nodeId - the ID of the node that participates in the causal relation
causalRelId - the ID of the causal relation
roleName - the name of the role of the node in the causal relation
causeNode - the ID of the node that identifies the cause of the causal relation
effectNode - the ID of the node that identifies the effect in the causal relation
-
addRhetoricalClass
public Boolean addRhetoricalClass(Integer nodeId, String rhetoricalClass, boolean deletePreexisting)
Add a rhetorical class to the set of rhetorical classes associated with the node.
Parameters:
nodeId
rhetoricalClass - if null or empty, no rhetorical class is added (to empty the rhetorical class set, pass null and set deletePreexisting to true)
deletePreexisting - if true, all pre-existing rhetorical classes are deleted
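The documented contract of addRhetoricalClass (and of addHeadWord, which reads the same way) can be illustrated with a small stand-alone sketch. The class LabelSetSketch, its method names, and the meaning of the boolean result (true when the set changed) are assumptions made for illustration; this is not the library's implementation, and the labels used in main are arbitrary.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for the per-node set of rhetorical classes (or head words).
class LabelSetSketch {
    private final Set<String> labels = new LinkedHashSet<>();

    // Optionally clear the set first, then add the value only if it is non-null and non-empty.
    // Assumption: the result reports whether the set changed.
    boolean add(String label, boolean deletePreexisting) {
        boolean changed = false;
        if (deletePreexisting && !labels.isEmpty()) {
            labels.clear();
            changed = true;
        }
        if (label != null && !label.isEmpty()) {
            changed |= labels.add(label);
        }
        return changed;
    }

    // Defensive copy so callers cannot modify the internal set.
    Set<String> view() {
        return new LinkedHashSet<>(labels);
    }

    public static void main(String[] args) {
        LabelSetSketch s = new LabelSetSketch();
        s.add("Background", false);
        s.add("Approach", false);
        s.add(null, true); // empties the set and adds nothing
        System.out.println(s.view());
    }
}
```

Passing null together with deletePreexisting set to true is the documented way to empty the set, which is why the clearing step runs before the null check.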
-
getRhetoricalClassSet
public Set<String> getRhetoricalClassSet(Integer nodeId)
Get the set of rhetorical classes associated with the node.
Parameters:
nodeId
-
addHeadWord
public Boolean addHeadWord(Integer nodeId, String headWord, boolean deletePreexisting)
Add a head word to the set of head words associated with the node. Usually, each node not derived from the merging of coreferents has only one head word.
Parameters:
nodeId
headWord - if null or empty, no head word is added (to empty the head word set, pass null and set deletePreexisting to true)
deletePreexisting - if true, all pre-existing head words are deleted
-
getHeadWordsSet
public Set<String> getHeadWordsSet(Integer nodeId)
Get the set of head words associated with the node.
Parameters:
nodeId
-
setSentenceIDTokensPair
public Boolean setSentenceIDTokensPair(Integer nodeId, Integer sentenceID, List<String> sentenceTokenList)
Set the pair of sentence ID and list of tokens for a specific node (list of sentence words separated by space).
Parameters:
nodeId
sentenceID
sentenceTokenList
-
getSentenceIDTokensPair
public org.apache.commons.lang3.tuple.Pair<Integer,List<String>> getSentenceIDTokensPair(Integer nodeId)
Get the pair of sentence ID and list of tokens of a specific node (list of sentence words separated by space).
Parameters:
nodeId
-
addEdge
public Integer addEdge(String name, Integer from, Integer to, boolean isSRL)
Add an edge.
Parameters:
name
from
to
isSRL
-
changeEdgeName
public boolean changeEdgeName(Integer edgeId, String newEdgeName)
Change the name of the edge. The new name must not be null or empty.
Parameters:
edgeId
newEdgeName
-
getAllOrderedDepthFirstEdges
public List<Integer> getAllOrderedDepthFirstEdges()
Get all edges of the dependency graph, ordered by a depth-first visit.
Returns:
ordered list of edge ids
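A depth-first ordering of edges can be sketched on a toy edge map as follows. The class DepthFirstEdgeOrderSketch, the int[] encoding of an edge as {from, to}, the explicit list of root nodes, and the ascending-id tie-break between sibling edges are all assumptions for illustration; the library's actual traversal order is not documented here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class DepthFirstEdgeOrderSketch {
    // edges: edge id -> {fromNodeId, toNodeId}; a TreeMap iterates in ascending edge id order.
    static List<Integer> orderEdges(TreeMap<Integer, int[]> edges, List<Integer> rootNodeIds) {
        List<Integer> ordered = new ArrayList<>();
        Set<Integer> seenEdges = new HashSet<>();
        for (Integer root : rootNodeIds) {
            visit(root, edges, seenEdges, ordered);
        }
        return ordered;
    }

    // Emit each outgoing edge of nodeId once, descending into its target before moving on.
    private static void visit(int nodeId, TreeMap<Integer, int[]> edges,
                              Set<Integer> seenEdges, List<Integer> ordered) {
        for (Map.Entry<Integer, int[]> e : edges.entrySet()) {
            // seenEdges.add returns false for an already emitted edge, which also stops cycles.
            if (e.getValue()[0] == nodeId && seenEdges.add(e.getKey())) {
                ordered.add(e.getKey());
                visit(e.getValue()[1], edges, seenEdges, ordered);
            }
        }
    }

    public static void main(String[] args) {
        TreeMap<Integer, int[]> edges = new TreeMap<>();
        edges.put(10, new int[] {1, 2});
        edges.put(11, new int[] {1, 3});
        edges.put(12, new int[] {2, 4});
        System.out.println(orderEdges(edges, Arrays.asList(1)));
    }
}
```

In the example, edge 12 is listed before edge 11 because the traversal follows node 2 down to node 4 before returning to the second edge of node 1.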
-
getAllEdges
public Map<Integer,org.apache.commons.lang3.tuple.Pair<Integer,Integer>> getAllEdges()
Get all edges of the dependency graph.
-
getEdgeFromNode
public Integer getEdgeFromNode(Integer edgeId)
Get the start node of the edge.
Parameters:
edgeId
-
getEdgeToNode
public Integer getEdgeToNode(Integer edgeId)
Get the end node of the edge.
Parameters:
edgeId
-
getEdgeSRLtag
public String getEdgeSRLtag(Integer edgeId)
Get the SRL tag of the edge.
Parameters:
edgeId
-
getEdgeSRLsensefeat
public String getEdgeSRLsensefeat(Integer edgeId)
Get the SRL sense of the edge.
Parameters:
edgeId
-
getEdgeSRLsentId
public Integer getEdgeSRLsentId(Integer edgeId)
Get the SRL sentence ID of the edge.
Parameters:
edgeId
-
getNodeCorefID
public Set<Integer> getNodeCorefID(Integer nodeId)
The IDs of the coreference chains in which the node is the head of one of their elements. Usually this is a one-element set, since each node can be the head of one element of a specific coreference chain.
Parameters:
nodeId
-
getNodeCorefName
public String getNodeCorefName(Integer nodeId)
Get the co-reference node name (aggregates all the tokens of the coreference chain element).
Parameters:
nodeId
-
getNodeSentOrder
public Integer getNodeSentOrder(Integer nodeId)
Get the node sentence order.
Parameters:
nodeId
-
getNodeCount
public int getNodeCount()
Number of nodes in the graph.
-
getEdgesByNameSourceAndDestination
public Set<Integer> getEdgesByNameSourceAndDestination(Integer sourceId, Integer destinationId, String name)
Get edges by source node id, destination node id, and name. At least one of source node id, destination node id, and name must not be blank or empty. The criteria that are supplied are combined with AND.
Parameters:
sourceId
destinationId
name
Returns:
the filtered edge ids
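The AND combination of optional criteria described above can be sketched as follows. EdgeFilterSketch, its nested Edge class, and the choice to return an empty set when no criterion is supplied are assumptions for illustration, not the library's code; the edge labels in main are arbitrary.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class EdgeFilterSketch {
    // Hypothetical edge record: id, endpoints and label.
    static final class Edge {
        final int id;
        final int from;
        final int to;
        final String name;

        Edge(int id, int from, int to, String name) {
            this.id = id;
            this.from = from;
            this.to = to;
            this.name = name;
        }
    }

    // A null id or a blank name means "no constraint"; supplied criteria are ANDed together.
    static Set<Integer> filter(List<Edge> edges, Integer sourceId, Integer destinationId, String name) {
        boolean hasName = name != null && !name.trim().isEmpty();
        Set<Integer> result = new LinkedHashSet<>();
        if (sourceId == null && destinationId == null && !hasName) {
            return result; // assumption: with no criterion at all, nothing is selected
        }
        for (Edge e : edges) {
            boolean sourceOk = sourceId == null || e.from == sourceId;
            boolean destinationOk = destinationId == null || e.to == destinationId;
            boolean nameOk = !hasName || name.equals(e.name);
            if (sourceOk && destinationOk && nameOk) {
                result.add(e.id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge(1, 1, 2, "nsubj"));
        edges.add(new Edge(2, 1, 3, "dobj"));
        edges.add(new Edge(3, 4, 2, "nsubj"));
        System.out.println(filter(edges, 1, null, null));
        System.out.println(filter(edges, null, 2, "nsubj"));
    }
}
```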
-
getEdgesByNameRegExp
public Set<Integer> getEdgesByNameRegExp(String nameRegExp)
Get the ids of the edges whose name matches the regular expression.
Parameters:
nameRegExp
-
getNodesByNameRegExp
public Set<Integer> getNodesByNameRegExp(String nameRegExp)
Get the ids of the nodes whose name matches the regular expression.
Parameters:
nameRegExp
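Selecting ids by a regular expression on the name, as getEdgesByNameRegExp and getNodesByNameRegExp do, can be sketched with java.util.regex. The class NameRegExpSketch and the id-to-name map are hypothetical. Whether the library requires the whole name to match or accepts a partial match is not stated, so the full-match semantics of Matcher.matches() used here is an assumption.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Pattern;

class NameRegExpSketch {
    // Return the ids whose name matches the expression in full (assumed semantics).
    static Set<Integer> idsByNameRegExp(Map<Integer, String> namesById, String nameRegExp) {
        Pattern pattern = Pattern.compile(nameRegExp);
        Set<Integer> ids = new TreeSet<>();
        for (Map.Entry<Integer, String> e : namesById.entrySet()) {
            if (e.getValue() != null && pattern.matcher(e.getValue()).matches()) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<Integer, String> names = new TreeMap<>();
        names.put(1, "cat");
        names.put(2, "cats");
        names.put(3, "dog");
        System.out.println(idsByNameRegExp(names, "cats?"));
    }
}
```

Replacing matches() with find() would give substring semantics instead, which changes the result for unanchored expressions.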
-
getParentNodeIds
public Set<Integer> getParentNodeIds(Integer nodeId)
The id(s) of the parent node(s).
Parameters:
nodeId
-
getChildrenNodes
public Set<Integer> getChildrenNodes(Integer nodeId)
The id(s) of the child node(s).
Parameters:
nodeId
-
getIncidentEdges
public Map<Integer,org.apache.commons.lang3.tuple.Pair<Integer,Integer>> getIncidentEdges(Integer nodeId, String name)
Get the set of edges entering the node. If name is not null or empty, only edges with that name are returned.
Parameters:
nodeId
name
-
getOutgoingEdges
public Map<Integer,org.apache.commons.lang3.tuple.Pair<Integer,Integer>> getOutgoingEdges(Integer nodeId, String name)
Get the set of edges going out from the node. If name is not null or empty, only edges with that name are returned.
Parameters:
nodeId
name
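The optional name filter shared by getIncidentEdges and getOutgoingEdges can be sketched on a toy edge table. IncidentOutgoingSketch and its Edge class are hypothetical, and reading the returned map as "edge id to (from, to) pair" is an assumption; the sketch uses an int[] of length two in place of the Apache Commons Pair to stay dependency-free.

```java
import java.util.Map;
import java.util.TreeMap;

class IncidentOutgoingSketch {
    // Hypothetical edge: endpoints and label.
    static final class Edge {
        final int from;
        final int to;
        final String name;

        Edge(int from, int to, String name) {
            this.from = from;
            this.to = to;
            this.name = name;
        }
    }

    // incoming = true selects edges entering nodeId, false selects edges leaving it.
    // A null or empty name disables the name filter.
    static Map<Integer, int[]> select(Map<Integer, Edge> edges, int nodeId, String name, boolean incoming) {
        Map<Integer, int[]> result = new TreeMap<>();
        for (Map.Entry<Integer, Edge> entry : edges.entrySet()) {
            Edge e = entry.getValue();
            boolean touches = incoming ? e.to == nodeId : e.from == nodeId;
            boolean nameOk = name == null || name.isEmpty() || name.equals(e.name);
            if (touches && nameOk) {
                result.put(entry.getKey(), new int[] {e.from, e.to});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Edge> edges = new TreeMap<>();
        edges.put(1, new Edge(5, 7, "SBJ"));
        edges.put(2, new Edge(6, 7, "OBJ"));
        edges.put(3, new Edge(7, 8, "SBJ"));
        System.out.println(select(edges, 7, null, true).keySet());
        System.out.println(select(edges, 7, "OBJ", true).keySet());
        System.out.println(select(edges, 7, "", false).keySet());
    }
}
```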
-
getSRLframeRoots
public Map<Integer,Set<String>> getSRLframeRoots()
Get a map from node ID (key) to a set of senses, considering only nodes that are roots of SRL frames. The set of senses is the set of different frames the node participates in as root.
-
getSRLframeParticipantTags
public Map<Integer,Set<Integer>> getSRLframeParticipantTags(Integer rootNodeId)
Get a map of predicates the node specified by ID is root of. Each element of the map has as key an unambiguous ID of the SRL frame (the ID of the sentence that contains the SRL frame) and as value the set of IDs of the edges participating in the frame.
Parameters:
rootNodeId
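The shape of the map returned by getSRLframeParticipantTags (sentence ID as key, set of participating edge IDs as value) can be illustrated with a grouping sketch. SrlFrameSketch and SrlEdge are hypothetical, and treating "edges participating in the frame" as the SRL edges that leave the root node is an assumption, not something the documentation states.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class SrlFrameSketch {
    // Hypothetical SRL edge: its id, the node it starts from, and the source sentence id.
    static final class SrlEdge {
        final int id;
        final int from;
        final int sentId;

        SrlEdge(int id, int from, int sentId) {
            this.id = id;
            this.from = from;
            this.sentId = sentId;
        }
    }

    // Group the SRL edges leaving rootNodeId by the sentence that contains the frame.
    static Map<Integer, Set<Integer>> participantEdges(List<SrlEdge> srlEdges, int rootNodeId) {
        Map<Integer, Set<Integer>> frames = new TreeMap<>();
        for (SrlEdge e : srlEdges) {
            if (e.from == rootNodeId) {
                frames.computeIfAbsent(e.sentId, k -> new TreeSet<>()).add(e.id);
            }
        }
        return frames;
    }

    public static void main(String[] args) {
        List<SrlEdge> edges = Arrays.asList(
                new SrlEdge(1, 10, 3),
                new SrlEdge(2, 10, 3),
                new SrlEdge(3, 10, 4),
                new SrlEdge(4, 11, 3));
        System.out.println(participantEdges(edges, 10));
    }
}
```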
-
graphAsString
public String graphAsString(GraphToStringENUM outputType)
Get a string representation of the graph.
Parameters:
outputType
-
deleteEdgesByNameRegExp
public int deleteEdgesByNameRegExp(List<String> regexpList, boolean deleteMatching)
Delete all the edges whose name matches one of the regular expressions.
Parameters:
regexpList
deleteMatching - if false, delete the edges that do not match
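The effect of the deleteMatching flag can be sketched on a plain map of edge names. DeleteByRegExpSketch is a hypothetical class; interpreting the int result as the number of deleted edges and using full-match semantics for the expressions are both assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

class DeleteByRegExpSketch {
    // Remove entries from edgeNames (edge id -> name) and return how many were removed.
    // deleteMatching = true removes names matching any expression; false removes all the others.
    static int deleteByName(Map<Integer, String> edgeNames, List<String> regexpList, boolean deleteMatching) {
        List<Pattern> patterns = new ArrayList<>();
        for (String regexp : regexpList) {
            patterns.add(Pattern.compile(regexp));
        }
        int deleted = 0;
        Iterator<Map.Entry<Integer, String>> it = edgeNames.entrySet().iterator();
        while (it.hasNext()) {
            String name = it.next().getValue();
            boolean matches = false;
            for (Pattern p : patterns) {
                if (p.matcher(name).matches()) {
                    matches = true;
                    break;
                }
            }
            if (matches == deleteMatching) {
                it.remove(); // safe removal while iterating
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        Map<Integer, String> names = new TreeMap<>();
        names.put(1, "SBJ");
        names.put(2, "OBJ");
        names.put(3, "P");
        System.out.println(deleteByName(names, Arrays.asList("P"), true));
        System.out.println(names.keySet());
    }
}
```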
-
mergeNodes
public boolean mergeNodes(Integer nodeId1, Integer nodeId2, String newNodeName)
Merge the second node with the first. The properties of the first node are preserved. Incoming and outgoing edges of the second node are moved to the first one. NB: the new name of a merged pair of nodes is consistent only if both nodes are in the same sentence, since the sentence-specific order is used to determine the order of the names of the merged nodes.
Parameters:
nodeId1
nodeId2
newNodeName - if not null, node 1 is renamed
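The edge re-wiring that a merge implies can be sketched on a toy graph. MergeNodesSketch, its two maps, and the failure conditions (unknown ids or merging a node with itself return false) are assumptions for illustration; the sketch keeps the first node's name when newName is null and does not model the sentence-order bookkeeping mentioned above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class MergeNodesSketch {
    final Map<Integer, String> names = new HashMap<>();  // node id -> name
    final Map<Integer, int[]> edges = new TreeMap<>();   // edge id -> {from, to}

    // Merge node "drop" into node "keep": redirect every edge endpoint, then remove "drop".
    boolean merge(int keep, int drop, String newName) {
        if (keep == drop || !names.containsKey(keep) || !names.containsKey(drop)) {
            return false;
        }
        for (int[] ends : edges.values()) {
            if (ends[0] == drop) {
                ends[0] = keep;
            }
            if (ends[1] == drop) {
                ends[1] = keep;
            }
        }
        if (newName != null) {
            names.put(keep, newName);
        }
        names.remove(drop);
        return true;
    }

    public static void main(String[] args) {
        MergeNodesSketch g = new MergeNodesSketch();
        g.names.put(1, "big");
        g.names.put(2, "dog");
        g.names.put(3, "barks");
        g.edges.put(10, new int[] {3, 2});
        g.edges.put(11, new int[] {2, 1});
        g.merge(2, 1, "big dog");
        for (Map.Entry<Integer, int[]> e : g.edges.entrySet()) {
            System.out.println(e.getKey() + " " + Arrays.toString(e.getValue()));
        }
    }
}
```

In this sketch an edge that connected the two merged nodes ends up as a self-loop on the surviving node, so a cleanup pass that removes self-loops is a natural follow-up step.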
-
getMergedIDmap
public Map<Integer,String> getMergedIDmap(Integer nodeId)
Get the map of original node ID and lexicalization of the nodes merged into the current node.
Parameters:
nodeId
-
addToMergedNameMap
public boolean addToMergedNameMap(Integer nodeId, Integer mergedNodeId, String mergedNodeName, boolean emptyMap)
Add an element to the node_mergedNameMap of the node.
Parameters:
nodeId
mergedNodeName
emptyMap
-
addToMergedIDmap
public boolean addToMergedIDmap(Integer nodeId, Integer mergedNodeId, String mergedNodeName, boolean emptyMap)
Add an element to the node_mergedIDmap of the node.
Parameters:
nodeId
mergedNodeName
emptyMap
-
getMergedNameMap
public Map<Integer,String> getMergedNameMap(Integer nodeId)
Get the map of sentence order ID and lexicalization of the sentence nodes merged into the current node.
Parameters:
nodeId
-
getCausalRoleNameMap
public Map<Integer,String> getCausalRoleNameMap(Integer nodeId)
Get the map of causal relations the node is part of. The keys are the IDs of the causal relations, while the values are the role of the node in each causal relation (CAUSE or EFFECT).
Parameters:
nodeId
-
getCausalCauseIDmap
public Map<Integer,Integer> getCausalCauseIDmap(Integer nodeId)
Get the map of causal relations the node is part of. The keys are the IDs of the causal relations, while the values are the ID of the cause node in each causal relation.
Parameters:
nodeId
-
getCausalEffectIDmap
public Map<Integer,Integer> getCausalEffectIDmap(Integer nodeId)
Get the map of causal relations the node is part of. The keys are the IDs of the causal relations, while the values are the ID of the effect node in each causal relation.
Parameters:
nodeId
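The three causal getters above describe parallel maps keyed by the causal relation ID. A minimal sketch of that per-node bookkeeping follows; CausalAnnotationSketch and its field names are hypothetical, and rejecting null arguments is an assumption about how such a setter might validate its input.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical per-node record of the causal relations the node takes part in.
class CausalAnnotationSketch {
    final Map<Integer, String> roleNameByRel = new TreeMap<>();    // relation id -> CAUSE or EFFECT
    final Map<Integer, Integer> causeNodeByRel = new TreeMap<>();  // relation id -> cause node id
    final Map<Integer, Integer> effectNodeByRel = new TreeMap<>(); // relation id -> effect node id

    // Store one relation in all three maps so they stay aligned on the same key.
    boolean setRelation(Integer causalRelId, String roleName, Integer causeNode, Integer effectNode) {
        if (causalRelId == null || roleName == null || causeNode == null || effectNode == null) {
            return false; // assumption: incomplete annotations are rejected
        }
        roleNameByRel.put(causalRelId, roleName);
        causeNodeByRel.put(causalRelId, causeNode);
        effectNodeByRel.put(causalRelId, effectNode);
        return true;
    }

    public static void main(String[] args) {
        CausalAnnotationSketch node = new CausalAnnotationSketch();
        node.setRelation(1, "CAUSE", 12, 40);
        node.setRelation(2, "EFFECT", 7, 12);
        System.out.println(node.roleNameByRel);
        System.out.println(node.causeNodeByRel);
        System.out.println(node.effectNodeByRel);
    }
}
```

Keeping one key space across the three maps means a caller can look up the role, the cause node, and the effect node of a relation with the same relation ID.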
-
sanitizeGraph
public void sanitizeGraph()
Check for and remove self-loop edges, and merge duplicated / non-SRL edges.
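One possible reading of this cleanup is sketched below: drop every self-loop, then keep a single copy of each non-SRL edge that has the same endpoints and name. SanitizeSketch and its Edge class are hypothetical, and leaving duplicated SRL edges untouched is an assumption, since the description does not say how they are handled.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class SanitizeSketch {
    // Hypothetical edge: endpoints, label and SRL flag.
    static final class Edge {
        final int from;
        final int to;
        final String name;
        final boolean isSRL;

        Edge(int from, int to, String name, boolean isSRL) {
            this.from = from;
            this.to = to;
            this.name = name;
            this.isSRL = isSRL;
        }
    }

    // Remove self-loops and later duplicates of non-SRL edges, in place.
    static void sanitize(Map<Integer, Edge> edges) {
        Set<String> seen = new HashSet<>();
        Iterator<Map.Entry<Integer, Edge>> it = edges.entrySet().iterator();
        while (it.hasNext()) {
            Edge e = it.next().getValue();
            if (e.from == e.to) {
                it.remove();
                continue;
            }
            // The key identifies an edge by endpoints and name; add() is false for a repeat.
            if (!e.isSRL && !seen.add(e.from + ">" + e.to + ":" + e.name)) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        Map<Integer, Edge> edges = new TreeMap<>();
        edges.put(1, new Edge(1, 1, "NMOD", false)); // self-loop
        edges.put(2, new Edge(1, 2, "NMOD", false));
        edges.put(3, new Edge(1, 2, "NMOD", false)); // duplicate of edge 2
        edges.put(4, new Edge(1, 2, "A0", true));
        sanitize(edges);
        System.out.println(edges.keySet());
    }
}
```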
-
compactNodes
public Integer compactNodes()
Tentative version of node-collapsing heuristics over a dependency graph.
-
main
public static void main(String[] args)
-
-