Been pondering over this some more.
Even if I use entire sentences, there is still no subject awareness to it. For example, you could be talking about repairing a computer and then respond to the AI with a generic sentence like "that may be difficult." If there is no kind of subject segregation, that generic sentence can lead into other subjects rather than staying on computers.
The way I initially tried to solve this was to map each input sentence to an output Markov chain list, so that each input sentence could generate its own output for that input. This still has the problem that the "that may be difficult." input could be linked to a list of chains that have no awareness of the original subject.
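A rough sketch of that per-input mapping, to make the problem concrete. Everything here is an assumption for illustration: the names `build_chain`, `generate` and `replies_by_input` are hypothetical, the training replies are made up, and the chain is a plain first-order word-level one rather than whatever the bot actually uses.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence boundary markers

def build_chain(sentences):
    """Build a first-order, word-level Markov chain from example sentences."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = [START] + sentence.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain

def generate(chain, rng, max_words=20):
    """Walk the chain from START until END or the word limit is reached."""
    word, out = START, []
    for _ in range(max_words):
        word = rng.choice(chain[word])
        if word == END:
            break
        out.append(word)
    return " ".join(out)

# Made-up data: each input sentence maps to the replies seen after it.
replies_by_input = {
    "my computer will not boot": ["check the power supply first"],
    # The generic input was seen in two unrelated conversations.
    "that may be difficult": ["cooking rice is not hard", "the repair is not hard"],
}

# One chain per input sentence.
chains = {text: build_chain(replies) for text, replies in replies_by_input.items()}

rng = random.Random(0)
reply = generate(chains["that may be difficult"], rng)
print(reply)
```

The chain for the generic input has no idea which conversation it is in, so it can just as easily come back with the cooking reply as the repair one.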
So my next idea might be to play around with a TextRank/PageRank-style algorithm, kind of like how Google ranks its search results based on user input. For example, the sentence "These computers are all destroyed and will have to be repaired." has a ranked keyword list of:
Code
[["repaired", 5.198454021461245], ["destroyed", 4.72890101697455], ["computers", 4.259348012487856]]
From there I could devise a way to generate responses based on these keywords. I could potentially take each unique keyword and maybe link it to a chain list.
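One possible way to do the keyword linking, as a sketch only. The ranked list is the one from this post; the `replies_by_keyword` index, its contents and the `score_replies` helper are hypothetical, and plain reply strings stand in for chain lists to keep it short.

```python
from collections import defaultdict

# Ranked keywords exactly as produced for the example sentence.
ranked = [
    ["repaired", 5.198454021461245],
    ["destroyed", 4.72890101697455],
    ["computers", 4.259348012487856],
]

# Hypothetical index: each keyword links to the replies it has been seen with.
replies_by_keyword = {
    "repaired": ["which parts need to be repaired", "who is doing the repair"],
    "computers": ["how many computers are affected", "which parts need to be repaired"],
}

def score_replies(ranked_keywords, index):
    """Sum the keyword weights for every reply linked to a ranked keyword."""
    scores = defaultdict(float)
    for keyword, weight in ranked_keywords:
        for candidate in index.get(keyword, []):
            scores[candidate] += weight
    # Highest combined weight first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

candidates = score_replies(ranked, replies_by_keyword)
best_reply = candidates[0][0]
print(best_reply)
```

A reply linked to more than one of the ranked keywords ends up with the highest combined weight, so it wins over replies that only match a single keyword.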
This still poses the problem of generic sentences triggering a topic change, although I think that if I implement some sort of short-term memory that stores the keywords from the last complex sentence, like the ones above, I can try to keep the topic from changing. This could possibly be done by comparing the lengths of the words and the weights that the ranking gave them.
For example, if a user responded with "That may be terribly difficult" to the bot's response, the ranking system would output:
Code
[["terribly", 3.779489856987712], ["difficult", 3.779489856987712]]
From there I could either keep the original keywords, mix these new keywords into the memory, or overwrite the memory with these new keywords. I haven't yet decided the logic for when to wipe the memory, retain it, or mix it.
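Just to have something to poke at, here is a sketch of the "mix" option only. It is not the decided logic: the `update_memory` helper, the `decay` and `floor` parameters and their values are all assumptions. Old keywords fade a bit on every turn, new ones are added at full weight, and anything that fades below the floor is forgotten.

```python
def update_memory(memory, ranked_keywords, decay=0.8, floor=1.0):
    """Mix new keywords into memory: fade old weights, then add the new ones."""
    # Every remembered keyword loses some weight each turn.
    updated = {word: weight * decay for word, weight in memory.items()}
    # New keywords come in at full weight (summed if already remembered).
    for word, weight in ranked_keywords:
        updated[word] = updated.get(word, 0.0) + weight
    # Forget anything that has faded below the floor.
    return {word: weight for word, weight in updated.items() if weight >= floor}

# Turn 1: the sentence about the destroyed computers.
memory = update_memory({}, [
    ["repaired", 5.198454021461245],
    ["destroyed", 4.72890101697455],
    ["computers", 4.259348012487856],
])

# Turn 2: the generic "That may be terribly difficult" reply.
memory = update_memory(memory, [
    ["terribly", 3.779489856987712],
    ["difficult", 3.779489856987712],
])

# The strongest remembered keyword is treated as the current topic.
topic = max(memory, key=memory.get)
print(topic)
```

With these numbers the strongest keyword after the generic reply is still "repaired", so the topic sticks. That depends entirely on the decay value: fade the old keywords faster and "terribly" takes over, which is exactly the part that still needs real logic.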
-----------------------------------------------
This topic is basically just me spouting random shit now, pay no concern.