XML encoding

Posted: 6 Mar, 2021
Difficulty: Moderate


Try Problem

The eXtensible Markup Language (XML) is a simple text-based format for representing information. The syntax for XML is as follows:

  1. Element => TAG Attributes END Child END
  2. Attribute => TAG Value

  • Value: String value.
  • TAG: Default mapping to an integer value.
  • Child: List of ‘Elements’ or a single ‘Value’. Never both.
  • End: Encoded as 0.

You are given an ‘Element’ object and the mapping of each ‘TAG’ to an integer value. The task is to encode this ‘Element’ using the ‘TAG’ mapping and return the resultant string.

Element object:
<codingNinjas course=“InterviewPreparation” type=“IndividualCourse”>
    <topic name=“Aptitude”>CourseContents</topic>

Tag mapping:
‘codingNinjas’ => 1, ‘topic’ => 2, ‘courseName’ => 3, ‘type’ => 4, ‘topicName’ => 5

The given element is ‘codingNinjas’, and it has a single child element ‘topic’. The element ‘codingNinjas’ has two attributes having tags as ‘course’ and ‘type’. The element ‘topic’ has one attribute with the tag as ‘name’. So, the encoded string is: 

‘1 3 InterviewPreparation 4 IndividualCourse 0 2 5 Aptitude 0 CourseContents 0 0’
1. ‘END’ is always encoded as 0.
2. An ‘Element’ can have zero or multiple ‘Attributes’.
Function description:
The function ‘encodeXML’ has the following parameters:
Pointer to an ‘Element’ object.
Mapping of ‘TAG’ name (string) to an integer value.
Structure of ‘Element’ and ‘Attribute’ object:
Element {
    String tag;
    List<Attribute> attributes;
    List<Element *> child;
    String value;
Attribute {
    String tag;
    String value;

If ‘Element’ stores a list of child elements, then ‘(Element => values)’ stores an empty string.
Input format:
The first line of input contains an integer ‘T’ denoting the number of test cases.

The first line of each test case contains an integer ‘N’ denoting the count of elements, including nested elements.

The following ‘N’ blocks of lines contain the description of ‘N’ elements. Each block is as follows:

 1. Each block’s first line contains three space-separated values, integer ‘ID’ denoting the element’s id, ‘M’ denoting the number of ‘Attributes’, and a string denoting the element’s ‘tag’. The ‘ID’ is used later on to identify the child elements.
 2. The next ‘M’ lines contain two space-separated integers denoting the ‘tag’ and ‘value’ of each ‘Attribute’.
 3. The next line contains either of the following:
     a. If the element stores a value, it contains the integer 0 and a string denoting the ‘value’ in the element, separated by space.
     b. Otherwise, an integer ‘C’ denoting the number of child elements.

The next ‘N - 1’ lines contain two space-separated integers, ‘A’ and ‘B’, the element with ‘id’ as ‘B’ is the child of the element with ‘id’ as ‘A’. The given ‘Element’ object always has ‘id’ equal to ‘1’ and has no parent.

The next line of each test case contains an integer ‘S’ denoting the number of ‘tags’.

The next ‘S’ lines of each test case contain two space-separated values, a string denoting the ‘tag’ name and an integer mapped to this ‘tag’.
Output format:
For each test case, return the encoded string.
You do not need to print anything; it has already been taken care of. Just implement the function.
1 <= T <= 10
2 <= N <= 1000

The count of ‘Attributes’ of all elements doesn’t exceed 1000.
The string ‘Value’ is non-empty and contains only ‘a-z’ and ‘A-Z’ characters.

Time Limit: 1 sec
Approach 1

To encode an element, we need to encode its 'TAG' and ‘attributes’. If it doesn’t store a ‘VALUE’, we need to encode its children in the given order. For encoding the children, we first encode their 'TAG' and ‘attributes’, then encode their children, and so on. We stop this process when we reach an element that stores a ‘VALUE’. So, we need to encode the list of child elements recursively. 


Create a helper function ‘encodeHelper’ that takes parameters as an ‘element’ object, 'TAG' to integer mapping, and references to a string. The helper function encodes the passed element and appends it to the ‘RESULT’. After that, it traverses through the list of child elements and makes recursive calls to itself with the child element as a parameter. In the recursive call, it appends the encoded child element to ‘RESULT’. The recursion ends when an element that stores a ‘VALUE’ is reached.


The helper function, ‘encodeHelper(Object ‘ELEMENT’, Map 'TAG_TO_INT', String& ‘RESULT’)’:


  • Append ‘RESULT’ with the integer mapped to the 'TAG' of ‘ELEMENT’.
  • Run a loop where ‘i’ iterates through the list of attributes in ‘ELEMENT’:
    • Append ‘RESULT’ with the integer mapped to the 'TAG' of attribute ‘i’.
    • Append ‘RESULT’ with the ‘VALUE’ stored in the attribute ‘i’.
  • If the ‘VALUE’ in the ‘ELEMENT’ is not ‘NULL’, then append ‘RESULT’ with ‘VALUE’.
  • Else, run a loop where ‘i’ iterates through the list of children in ‘ELEMENT’:
    • Call the function ‘encodeHelper’ with parameters (‘i’, 'TAG_TO_INT', ‘RESULT’).


In the ‘encodeXML’ function, we call ‘encodeHelper’ with parameters: (object of input ‘ELEMENT’, given 'TAG' to integer mapping, an empty string ‘RESULT’). The string ‘RESULT’ stores the converted output string.


  • Initialize ‘RESULT’ as an empty string.
  • Call function ‘encodeHelper’ with parameters (object of input ‘ELEMENT’, given 'TAG' to integer mapping, ‘RESULT’).
  • Return ‘RESULT’ as the answer.
Try Problem