d2jsp
Log InRegister
d2jsp Forums > Off-Topic > Computers & IT > Programming & Development > C++ Recursion
12Next
Add Reply New Topic New Poll
Member
Posts: 1,358
Joined: Dec 30 2012
Gold: 0.10
Jun 16 2014 10:07pm
I have a class with a vector of objects that are of the same class. i.e:

Code
class CTag {
public:
CTag(const std::string name): m_name(name){}
~CTag(){}

std::string m_name;
std::vector<CTag*> m_children;
};


Basically what I'm doing is parsing html and creating a tree of ctag objects.

Code
<div class="main">
<h1><a href="/">h1 text</a></h1>
<p>
paragraph text
<a href="">url text</a>
<a href="">url text</a>
</p>
</div>


Here's a simple representation of what it'd look like after being parsed into CTags.

Code
div
|
V
m_children
| |
V V
h1 p
| |
V V
m_children m_children
| | |
V V V
a a a


I'm trying to create a recursive function that will print out the tags in the same fashion as the html above but can't seem to wrap my head around it >.<

This post was edited by SelfTaught on Jun 16 2014 10:13pm
Member
Posts: 13,425
Joined: Sep 29 2007
Gold: 0.00
Warn: 20%
Jun 16 2014 10:22pm
I don't think recursion is the way to solve this. If someone passes a long enough html log, you risk the chance of causing a segfault. I think iteration would be a more practical choice.

Although why are you reinventing the wheel. There should be plenty of libraries that do this already.
Member
Posts: 1,358
Joined: Dec 30 2012
Gold: 0.10
Jun 16 2014 10:44pm
Quote (AbDuCt @ Jun 16 2014 08:22pm)
I don't think recursion is the way to solve this. If someone passes a long enough html log, you risk the chance of causing a segfault. I think iteration would be a more practical choice.

Although why are you reinventing the wheel. There should be plenty of libraries that do this already.


Ah yeah, good point. Could you give an example of how you'd iterate over this? Even pseudo code would be satisfactory.

I'm trying to create an additional layer of abstraction so I can parse specific data from sites, in specific chunks of html on pages. I haven't been able to find a library which allows me to achieve that without using regular expressions. And... I'm trying to avoid using regular expressions cus I've read it's not a recommend / reliable solution for parsing data from html. There are plenty of libraries for iterating over html / xml though which I'm using.

This post was edited by SelfTaught on Jun 16 2014 10:46pm
Member
Posts: 13,425
Joined: Sep 29 2007
Gold: 0.00
Warn: 20%
Jun 16 2014 10:56pm
I might write something up tomorrow if you're still looking, I'm a bit tired tonight. Although try to look for a framework like something like http://nokogiri.org/

It allows you to import an entire html file, and dissect it via methods and such. In the languages nokogiri supports it even has enumeration on html entities so you can iterate over each link in the page (links are objects for example).
Member
Posts: 9,803
Joined: Jun 28 2005
Gold: 6.67
Jun 17 2014 03:31am
How can you go deeper iteratively?
Member
Posts: 32,925
Joined: Jul 23 2006
Gold: 3,804.50
Jun 17 2014 06:56am
Quote (KrzaQ2 @ Jun 17 2014 05:31am)
How can you go deeper iteratively?


afaik, anything that can be done recursively can be done iteratively.
Member
Posts: 1,358
Joined: Dec 30 2012
Gold: 0.10
Jun 17 2014 09:16am
Quote (carteblanche @ Jun 17 2014 04:56am)
afaik, anything that can be done recursively can be done iteratively.


Yeah I think theyre both the same but conceptually different.

This post was edited by SelfTaught on Jun 17 2014 09:21am
Member
Posts: 4,372
Joined: Jul 2 2009
Gold: 3,630.00
Jun 17 2014 10:05am
Quote (KrzaQ2 @ 17 Jun 2014 11:31)
How can you go deeper iteratively?


You can use a stack to store the nodes that need to be processed.

This post was edited by Futu_Re on Jun 17 2014 10:06am
Member
Posts: 13,425
Joined: Sep 29 2007
Gold: 0.00
Warn: 20%
Jun 17 2014 12:32pm
Quote (KrzaQ2 @ Jun 17 2014 05:31am)
How can you go deeper iteratively?


Could iterate storing details each complete iteration eventually building a tree couldn't you?

Something like

Quote
<html>
  <head>
      <title>test</title>
  <head>

  <body>
      <p>paragraph</p>
      <h1>header</h1>
  </body>
</html>


First complete iteration will find all nodes below HTML (head, body) stored to a list or array for processing until all the current levels nodes have been found.

so tree would look like

Quote

                  HTML                    [head, body]
                  /      \
          head          body


Next you would iterate again but find the next depth to iterate which would be head which was the first to be found, and stored in the list/array. It would first look for new nodes, if any were to be found create an array/list of them to be further searched, if not store the data contained in that node.


Quote

                  HTML                    [head[title], body]
                  /      \
          head          body
            |                |
          title
          /
      "test"


Then you would do this again but this time it would result in searching title for new nodes, but none will be found so its data "Test" would be added.

Then you can remove the head node from your list/array because you completely searched it and process your next node in the array/list body, which it would find two nodes to process while iterating. From there it would process again and will either search for more nodes or their data.


Quote

                  HTML
                  /      \
          head          body                    [body[h1, p]]
            |              |      \
          title
          /
      "test"


End result would be something like this

Quote

                  HTML
                  /      \
          head          body
            |              |      \
          title          h1      p
          /              |            \
      "test"      "header"    "paragraph"


Is this idea any better than recursion hell if I know. I just dislike recursion because I'm not bright enough to use it for complex needs.
Member
Posts: 1,358
Joined: Dec 30 2012
Gold: 0.10
Jun 17 2014 03:03pm
Lol, yeah I don't fully understand which would be more beneficial either. But iteration makes sense and works for me though so I'll go with it. Thanks for taking the time to explain that.
Go Back To Programming & Development Topic List
12Next
Add Reply New Topic New Poll