Wiki Article
Talk:Trie
Nguồn dữ liệu từ Wikipedia, hiển thị bởi DefZone.Net
| Trie has been listed as one of the Engineering and technology good articles under the good article criteria. If you can improve it further, please do so. If it no longer meets these criteria, you can reassess it. Review: April 18, 2022. (Reviewed version). |
| This article is rated GA-class on Wikipedia's content assessment scale. It is of interest to the following WikiProjects: | ||||||||||||||||||
| ||||||||||||||||||
| The content of B-trie was merged into Trie. The former page's history now serves to provide attribution for that content in the latter page, and it must not be deleted as long as the latter page exists. For the discussion at that location, see its talk page. |
Erroneous Python example
[edit]The Python example seems to be wrong as it lacks a recursive "find". — Preceding unsigned comment added by 178.2.130.142 (talk) 10:12, 15 February 2014 (UTC)
- I don't see any erroneous-ness: the example does not purport to use recursion, and it does find
keywhenkeyis innode. There was an issue withinsertnot functioning properly because the condition to build new nodes was predicated on a non-zero last-index for the string to be inserted, but I just fixed that. Zacharysyoung (talk) 06:00, 11 May 2018 (UTC)
Disadvantages
[edit]I think there should be a section describing the disadvantages of tries. Currently, there are a few disadvantages listed under the "comparison with other data structures" section.
And one disadvantage that I don't see mentioned anywhere is the space/time tradeoff of managing child nodes: either you need a (potentially large and sparse) array of indexes at each node, or you need to implement a secondary search algorithm to find the appropriate child(eg, the Python example uses a hash lookup over a dictionary of children). While the indexed approach may not be an issue for relatively shallow tries using ASCII (ie, the example at the top of the page), it will quickly become an issue with deep tries and/or large value spaces (eg, Unicode). Perhaps someone with more practical experience can comment on this? —Preceding unsigned comment added by 64.94.198.98 (talk) 17:21, 25 May 2010 (UTC)
Misleading Diagram
[edit]A better diagram for the page would show a trie where there are more than two edges from each node. The current diagram makes a trie seem too much like a binary search tree, and the idea that each node can have multiple edges depending on the lexicon is lost. —Preceding unsigned comment added by 134.197.40.199 (talk) 13:29, 11 February 2009 (UTC)
- This has now been fixed. me_and (talk) 09:39, 20 May 2010 (UTC)
As replacement of other data structures (hash table)
[edit]The article currently says that "The worst-case lookup speed in an imperfect hash table is O(log(N)) time." I would think that the worst case would be O(N) in the case where all keys were mapping to the same value, and the resulting linked list of length N must be traversed. Tom Hubbard 02:27, 11 September 2007 (UTC)
OK, it looks like someone fixed this -- thanks. Tom Hubbard (talk) 14:00, 28 November 2007 (UTC)
Complexity of tries
[edit]I am just wondering, Wolfskeeper, if you yet understood complexity theory and tries, or if you just go around breaking articles because you did not (yet). Perhaps we should sort this out before you make further changes, that might need to be undone as well. --ISee 10:28, 28 July 2005 (UTC)
The pseudo code for lookup looks wrong. This will never finds strings that are prefixes of other strings that are also stored in the trie. Am I mistaken?
- That's exactly what I thought. I think the other algorithm was actually wrong entirely, since it would only match true if there was a NULL node (no node at all) for the end of the word. I've made the correction of adding a "terminal" field which denotes the end of a word. —EatMyShortz 13:52, 30 April 2006 (UTC)
"Average lookup speed is theoretically the same, but a trie is faster in practice." That seems like a highly dubious claim, unsupported by evidence! In my experience the opposite is usually true (the main benefit of tries is thus that it can grow easily, and it may take less space since prefixes of keys are shared).
- The performance is highly related to the usage pattern. In a hashtable, you can retrieve a value while touching only in 1-2 cache lines on average, whereas in a trie this can be upto its depth (length of the prefix). However, a HT will almost always require these cache fetches whereas especially in a tight loop the trie will often have most or all of the lines cached (and esp for negative results). --66.7.145.210 18:27, 5 October 2006 (UTC)
- Also as I know, a hashtable lookup requires the search-key to be hashed before while the lookup in a trie can start almost instantly. Does anybody else know the Judy Arrays or Google sparsehash? I think these are an implementation of tries (at least its JudySL class) altough the term "trie" is never used to describe both of them. --195.3.81.25 10:15, 27 August 2007 (UTC)
I'm not certain-enough of this to edit the document without research, but... If you are going to take the length of keys into account when discussing the complexity of lookup with tries then you should do the same with the other methods --otherwise you are comparing apples to oranges. Normally in discussing binary trees and hash tables, it is assumed that the keys are small constant-sized objects. When discussing tries, you have to think of strings. In this case, each comparison in a binary-tree search is a O(m) string compare so the complexity of the lookup is not O(log n) but O(log n*m). Similarly, the expected lookup time for a hash table is O(m), not O(1) because forming the hash typically takes O(m) time (you can avoid looking at all characters, but that probably effects the expected lookup time). --David Gudeman —Preceding unsigned comment added by 69.181.140.228 (talk) 04:22, 14 July 2008 (UTC)
- You are correct about apples and oranges. However, your calculation of the complexity of a trie lookup is wrong. It is O(m), not O(m logn). In fact, to be more precise, it is O(m*p), where p is the number of letters in the alphabet. The number of words that are already in the trie (n) is completely irrelevant. More so than for hash tables, whose lookup complexity is not completely independent from n, due to the possibility of collisions. This cannot be captured nicely in a formula, so we simply say that hash table lookups are O(m).
- In any case, I was very surprised to see that complexity information is completely missing from the article! 83.166.177.69 (talk) 07:29, 13 September 2015 (UTC)
Diagram Numbering
[edit]In the diagram of trie on the right hand side of http://en.wikipedia.org/w/index.php?title=Trie&oldid=65534606 the elements in the trie are numbered. What do these numbers represent? I can't for the life of me work it out - could an explanation also be added to the article?
I don't think they are intended to have any meaning. The use of numbers may not have been the most effective way to make their point, which is that only nodes with valid keys store values. Notice that the "t" and "te" nodes are not marked with numbers. This is because "t" and "te" are not real English words; therefore, they cannot serve as keys to any value (this _seems_ to be the idea). I don't think readers will be confused by this in general though, because the third paragraph provides an explanation. Maybe this was added after you made your request. Danielx 08:19, 21 August 2006 (UTC)
I agree with Danielx about the meaning of the numbers (i.e. they are merely arbitrary "values" associated with the string "keys"). However I have a problem with the diagram caption. Since, as Danielx points out, "t" and "te" are not marked with numbers then they are not "keys", but just key prefixes. The caption lists them as if they were keys and I believe that may be confusing. Paritybit (talk) 03:48, 10 September 2008 (UTC)
Binary?
[edit]It looks like Tries are not binary (Ie. A node doesn't have at most two children). This would be much more clear if the diagram featured a node with more than two children. (Whoops, forgot to sign, here it goes:) --Thenickdude 05:03, 2 September 2006 (UTC)
- True, that. I should fix the diagram. Deco 02:33, 2 September 2006 (UTC)
- I updated it so a couple nodes have three children. Superm401 - Talk 23:03, 8 March 2009 (UTC)
Knuth reference
[edit]When looking up the reference by Knuth (The Art of Computer Programming, Volume 3: Sorting and Searching). I couldn't find a third edition, only a second edition from March 1998 (first print)
More information can also be found on the publishers website: http://www.aw-bc.com/catalog/academic/product/0,1144,0201896850,00.html
Maybe someone with more experience (this is my first Wikipedia post) could verify this and modify the article.
Thanks in advance Jwk3402 13:04, 2 April 2007 (UTC)
Other names
[edit]Is discrimination tree another name for a trie? Wikipedia doesn't have an entry for D. T. and Google isn't helping. -- Ralph Corderoy 11:50, 28 May 2007 (UTC)
- Thanks for the hint. I added one reference that seems to say a discrimination tree uses a trie data structure. Better references would be appreciated. --DavidCary (talk) 15:15, 9 July 2016 (UTC)
Huffman coding
[edit]Maybe Huffman coding should be added to "See also", since the basic idea seems to be very similar to me. Subwy (talk) 20:22, 13 August 2008 (UTC)
- Yeah sort of. You can interpret a Huffman tree as being a trie, and Huffman coding as describing the search paths followed when you look up a sequence of characters in that trie. It's not a bad analogy. Dcoetzee 20:43, 13 August 2008 (UTC)
- I too think so, I am adding it to the article. --Jyoti (talk) 06:46, 15 November 2008 (UTC)
Why did Edward Fredkin choose that word?
[edit]Since he pronounced it homophonous to ‘tree’, didn't he realize that it was a pretty stupid choice, because that would make it impossible to distinguish the words in speech? If he was so desperate to combine ‘tree’ and ‘retrieve’, surely he could have done better? Shinobu (talk) 22:06, 5 October 2008 (UTC)
Example for implicit branch labels?
[edit]The article currently states that the label of the branches is often implicit in the ordering of the branches. I'm trying to think of a good example of this but all I can think of are binary prefix codes, which sort of use binary lookup trees anyway. All the examples in the article seem to assume every node has a map of its children (though my Haskell isn't all that up to scratch). Perhaps an explicit example would be useful... Wppds (talk) 09:05, 8 January 2009 (UTC)
If keys were made of us US-ASCII lower-case letters, and each node had an array of 26 child pointers, you wouldn't have to say "this character is a 'c' ", because you could know that from the pointer to the node being in the 3rd position. — Preceding unsigned comment added by JimJJewett (talk • contribs) 21:51, 7 May 2013 (UTC)
Section on lexicographic sorting using a trie is unclear and in some cases, wrong
[edit]Only a pre-order traversal can be used to sort the keys in lexicographic order. Furthermore, a lexicographic ordering of edge pointers within each node is required in order to use the pre-order traversal. In other words, the children of a node would have to be traversed in lexicographic order.
Please see this link for a reference: http://books.google.com/books?id=G_U8ghxhT4wC&pg=PA379&dq=lexographic+sorting+using+trie
Calvin (talk) 17:22, 18 February 2009 (UTC)
Some people overgeneralize from seeing examples where strings are only associated with the leaf nodes in a trie to believing, incorrectly, that strings can only be associated with the leaf nodes in a trie. When a trie represents overlapping strings, such as, "the", and, "there", the word, "the", will be associated with an interior node of the trie and will be encountered first and be printed first in a preorder traversal before the preorder traversal reaches the node that is associated with the word, "there", which itself might be associated with an interior node if there is a longer overlapping string represented by the trie, such as, "therefore".
Whether or not a postorder traversal of a trie will produce text in reverse lexicographic order depends on how the strings are represented. If the trie is acting an an index to strings with a pointer to each string, then a postorder traversal of a trie will produce text that is in reverse lexicographic order.
67.167.247.90 (talk) 20:40, 25 October 2020 (UTC) Edward Lee
Lexicographic ordering and post-order traversal
[edit]Calvin is right: the article mistakenly claims that post-order traversal of the trie will result in lexicographically-reversed output. That's quite misleading and wrong. A better, pedagogical approach might be to consider an Euler traversal (Euler traversal)(generalized to trees with more than 2 children) in this part of the discussion. An Euler tour of a tree structure covers both pre-order and post-order traversal of a tree structure in the same pass. Pre-order events can be used to construct and output the strings in lexicographical order but you also need to capture post-order events in order to know when to remove characters from the prefix string.
--Here2L8 (talk) 20:26, 5 April 2009 (UTC)
trie = DFA ?
[edit]Other than the confusing name, how is this different from a deterministic finite automaton? 12.108.187.222 (talk) 20:25, 7 May 2009 (UTC)
- It's a special case of a DFA -- precisely, it's a DFA which is a tree. --196.215.6.126 (talk) 14:35, 17 July 2010 (UTC)
- Generally (and technically), a trie is not a DFA because some (in practice it is usually most) states don't have all transitions defined. There is a natural DFA that corresponds to the trie, but this small distinction highlights the point of tries. It is a simple data structure that, for some use cases, is efficient for storing strings. Making the connection to DFAs to then drop it during implementation just complicates things. — xo Ergur (talk) 11:21, 7 August 2025 (UTC)
sorting
[edit]In the sorting section, the two statements
- Pre-order traversal is a kind of depth-first traversal. In-order traversal is another kind of depth-first traversal that is more appropriate for outputting the values that are in a binary search tree rather than a trie.
are confusing. Should one of them be breath-first traversal? Kowloonese (talk) 17:37, 30 October 2009 (UTC)
Breadth-first traversal of a trie would work if you are representing the digits of integers with variable lengths as paths in a trie. I have no citation for this, just my original insight. Otherwise, a depth-first traversal would be appropriate.
In a binary search tree, any descendant nodes down to the left branch of a given parent node contain values that are lower than the value in the parent node, and any descendant nodes down to the right branch of a given parent node contain values that are higher than the value in the parent node. For the strings, "The", "There", and, "Therefore", the word, "There", might be in the root node, and the word, "The", might be in the left child node, and, the word, "Therefore", might be in the right child node. Imagine there is a line connecting, "There", with, "The". Imagine there is another line connecting, "There", with, "Therefore":
- There
- The Therefore
An in-order traversal is usually implemented as a recursive depth-first traversal algorithm that attempts to go as far down the left branches of a binary tree as a possible. Upon returning from a visit down a left branch, the contents of the current node are printed. Then, the algorithm attempts to recursively go down the right branch. Pseudocode for an in-order traversal is:
- Procedure InOrder(p as Pointer)
- Begin
- If p is Null Then Return
- InOrder(p->Left)
- Print p->String
- InOrder(p->Right)
- End
So, in this example, the InOrder() procedure would be called with a pointer to the root node, which contains the value, "There." The procedure recursively calls itself with a pointer to the left node. The second instance of the procedure corresponds to the left child node that contains, "The". This instance of the procedure then calls itself recursively with a Null pointer for the left node. The third instance of the procedure then immediately returns, because the pointer is Null. Upon returning from the left branch, the second instance of the procedure prints the contents of the current node, "The". The second instance of the procedure then recursively calls itself with a Null pointer to the right branch. The fourth instance of the procedure immediately returns to the second instance of the procedure, because the pointer is Null. Upon returning from the right branch, the second instance of the procedure then returns to the first instance of the procedure which corresponds to the root node. Upon returning from the left branch, the first instance of the procedure prints the contents of the root node, "There." The first instance of the procedure then recursively calls a fifth instance of the procedure with a pointer to the right node, which contains, "Therefore". The fifth instance of the procedure then recursively calls itself with a Null pointer to the left node. The sixth instance of the procedure immediately returns, since the pointer is Null. Upon returning from the left branch, the fifth instance of the procedure then prints the value in the current node, "Therefore". The fifth instance of the procedure then recursively calls itself with a Null pointer to the right node. The seventh instance of the procedure then immediately returns, since the pointer is Null. Upon returning from the right branch, the fifth instance of the procedure that corresponds to the word, "Therefore", then returns to the first instance of the procedure corresponding to the root node. Upon returning from visiting the right branch, the first instance of the procedure corresponding to the root node then returns, finishing the in-order traversal.
A preorder traversal prints or otherwise processes the value associated with the current node in a tree first before visiting any descendant nodes. That is what has to be done to print in ascending lexicographic order the values represented by a trie where the strings:
- The
- There
- Therefore
are arranged as overlapping paths in the trie where the paths represent the individual digits or individual characters of the string prefixes. There are multiple options for how to represent the strings. For example, you can concatenate the string prefixes that are represented by a path that has been traversed from the root node down to a given node or have a pointer to a string in each node that corresponds to the path that has been traversed from the root node down to a given node. A trie doesn't have to be a binary trie, but if it were a binary trie, the pseudocode for a preorder traversal would be something like:
- Procedure PreOrder(p as Pointer)
- Begin
- If p is Null then Return
- Print p->String
- PreOrder(p->Pointer[0]) 'Recursively visit any nodes down the branch corresponding to string prefix digit zero.
- PreOrder(p->Pointer[1]) 'Recursively visit any nodes down the branch corresponding to string prefix digit one.
- End
A postorder traversal routine goes as deep as possible down a particular branch of a tree until no further descent is possible and then switches to the next available branch, recursively going down until no further descent is possible for any branch, and prints or otherwise processes the contents of the current node prior to returning. The text book version of postorder traversal that I learned depicted visiting the branches from the left to right, but for displaying the values associated with a trie in descending lexicographic order, the branches have to be visited from right to left, corresponding to the higher digital values to the lower digital values. The pseudocode for a postorder traveral of a binary trie would be something like:
- Procedure PostOrder(p as Pointer)
- Begin
- if p is Null then Return
- PostOrder(p->Pointer[1]) 'Recursively visit any nodes down the branch corresponding to string prefix digit one.
- PostOrder(p->Pointer[0]) 'Recursively visit any nodes down the branch corresponding to string prefix digit zero.
- Print p->String
- End
In-order traversal, preorder traversal, and postorder traversal are all depth-first traversal algorithms. I don't like using hyphens for spelling pre-order or post-order traversal, because it takes too long to type. If you visualize in your mind the sequence of nodes that are visited by these recursive algorithms, then you will see that depth-first traversal is equivalent to the wall following algorithm for traversing a maze by keeping one of your hands on a wall at all times, which will make the maze traveler visit all of the maze if there is no exit. You can see examples of real preorder and postorder traversal routines for a binary trie in the C language source code for my BRADSORT program, which is one of the first implementations of a trie-based radix sorting algorithm, which you can probably find with Google. The code has been commented out with an #ifdef but can be made active again.
-Edward Lee 67.167.247.90 (talk) 00:18, 15 November 2021 (UTC)
Compared to hash tables
[edit]The article currently says, “[…] tries have the peculiar feature that the time to insert, or to delete or to find is almost identical … (unlike say hash tables where deletion is extremely fast, but finding and especially insertion is considerably slower).“ The part in parentheses makes no sense. First of all, finding must be performed before deletion so deletion cannot be faster (and isn’t). It also shouldn’t be faster than insertion, neither in open nor in closed addressing. I’m removing that part since it’s wrong (or at best easily misunderstood), and also unnecessary. —Preceding unsigned comment added by 77.186.145.151 (talk) 15:51, 18 July 2010 (UTC)
- To be fair I could interpret this as referring to deletion via an iterator, which could maintain a pointer to the data item, in which case it does appear to represent an actual (but relatively unimportant) distinction between tries and hash tables. I support removing it. The distinction of actual importance is the prefix sharing by tries. Dcoetzee 18:33, 18 July 2010 (UTC)
- I think the phrase in parentheses is simply reversed -- Hash tables may take extraordinarily long to delete an element if using a closed hash, because the entire table needs to be re-hashed to settle any collisions that will no longer occur. That is, subsequent searches will need to know that they should consider the deleted table entry a possible collision. This can be fixed by using tombstone values, or instead using a chained/open hash table. —Preceding unsigned comment added by 216.239.45.4 (talk) 19:24, 25 August 2010 (UTC)
Is a trie a data structure or an algorithm?
[edit]The opening sentence says a trie is a data structure. But then the first section is headed Advantages relative to other search algorithms, and it begins, "Unlike most other algorithms, tries...".
So is a trie a data structure or an algorithm? On this talk page under trie = DFA ? an anonymous poster concludes that a trie is a DFA. That seems to be more accurate than either data structure or algorithm. IOLJeff (talk) 14:27, 3 July 2011 (UTC)
- It's a data structure of course. It stores data. Like other data structures, it comes with algorithms for accessing (searching) and updating it - that's what first section seems to be talking about. You could view a trie as a DFA. But note that you can insert or remove strings from tries at any time and this results in different DFA's, so this viewpoint is restricted only to immutable tries. -- X7q (talk) 18:18, 3 July 2011 (UTC)
Advantages/Disadvantages mess
[edit]The article is currently a real mess when it comes to trying to understand why and when Tries are used instead of hash tables, for example. There is the "advantages" section, and there is the "instead of other data structures" section, both trying to say the same thing, but both completely missing the point:
Tries are mostly (or only?) used to to hold a relatively-dense list of words - it could be a spell-checking dictionary (the only good example given in this article), a routing table (see Patricia Tree, a variant of trie), and so on. The advantages of tries over hash tables here are: 1. They often take less memory (as whole words aren't stored), 2. They are MUCH faster to build given a sorted word list (as well as easily saving a sorted word list), 3. While hash-tables can only do exact word matching, tries make it easy to do longest prefix match, approximate string matching, and so on.
195.110.40.7 (talk) 08:47, 13 March 2012 (UTC)
- There are also some statements in there for which we have to let's say make charitable assumptions in order for them to be true at all - for instance, the claim that searching a binary tree takes O(m log n) time, which assumes that comparing a key takes m time. Elsewhere the article describes hash tables as constant-time, which requires the opposite (and more customary) assumption that key-sized operations are constant-time. I suggest removing the entire advantages/disadvantages section. It does not seem to reflect nor contribute to a good understanding of what's actually going on in these data structures, and it is is uncited. 130.179.29.61 (talk) 19:18, 15 March 2013 (UTC)
Please verify the "A compiling Python version" version
[edit]The code uses a defaultdict, but then the algorithms sometimes rely on getting a KeyError, which is precisely what a defaultdict with a factory prevents. Perhaps setdefault would work better.
JimJJewett (talk) 21:59, 7 May 2013 (UTC)
The "compiling Python version" compiles but doesn't work indeed :) Like you say, the "remove" never hits the False clause because of the factory passed to defaultdict, and prefix goes out of range when going down to empty string. I've re-written it to work as folows, although the prefix method should be removed, or the lookup method should be re-written not to cache searches so that they make sense together. BACbKA (talk) 19:04, 12 May 2013 (UTC)
Since nobody seems to care, I'm moving my code into the article, even though I haven't addressed the prefix vs lookup issue. At least it works now... Thanks for the alert, JimJJewett! BACbKA (talk) 21:07, 11 June 2013 (UTC)
Confusion about keys
[edit]Would be clearer to change: Unlike a binary search tree, no node in the tree stores the key associated with that node
to
Unlike a binary search tree, no node in the tree stores the entire key associated with that node --129.42.208.183 (talk) 22:00, 20 January 2014 (UTC)
- Your formulation would suggest that nodes store part of the key. That's not true: it's the edges (links) that encode the keys, not the nodes. The nodes only carry values or "final" bits. QVVERTYVS (hm?) 22:33, 20 January 2014 (UTC)
Sorting section uses wrong traversal
[edit]I think the sentence "Output all keys in the trie by means of pre-order traversal, which results in output that is in lexicographically increasing order." uses the wrong traversal. After having a look at Tree_traversal i would use the depth-first|in-order variant. Additionaly there is no sorting of the branches of a node. And the traversal article uses strict left to right traversing. I can't see how you will get an ordered list this way. — Preceding unsigned comment added by 178.192.228.27 (talk) 08:11, 10 November 2014 (UTC)
- It is part of the definition of a trie that the branches are sorted: they correspond to symbols (characters) and those are assumed to be ordered, so you need to do a traversal that follows this ordering. As for in-order vs. pre-order, that makes no difference if you want to output only the keys, but if you want to output all their prefixes, you need to do it pre-order. The algorithm is as follows:
traverse(n : Trie node, s : string)
if n is a final node:
output s
for each symbol c of the trie's alphabet, in order:
if there is a child node of t associated with c:
traverse(child c of n, s + c)
- Skipping the check for final nodes produces the prefixes as well. QVVERTYVS (hm?) 11:09, 10 November 2014 (UTC)
- Sorry i missinterpreted the examples at the tree-traversal because they didn't work with strings. However in the example image at the top of the page a can not recognize a sorting. In the algorithms section i also can not see any comparisons to sort the branches, however i know nothing about haskell so i could be wrong. (Must the input be ordered before applying the algorithm?) --178.192.228.27 (talk) 15:17, 10 November 2014 (UTC) (Why can't i use my de-login for en-pages? Arrrrgh)
- The picture is a bit confusing as it indeed shows the children of some nodes in unsorted order. As for sorting the branches, that's usually done implicitly. A baseline implementation of a trie node in C would be
struct Node {
int final;
struct Node *children[256];
};
- A node
nhas a child for symbolciffn.children[c] != NULL. The Haskell example uses aMap, i.e., a BST, to store the children. That sorts them automatically. - The code examples should really be replaced by pseudocode, since now you have to know Haskell and Python to read them. The Python example also doesn't store the children in sorted order, because it puts them in a hash table. QVVERTYVS (hm?) 15:45, 10 November 2014 (UTC)
- A node
DFA without loops
[edit]Quote from the article:
- A trie can be seen as a deterministic finite automaton without loops.
I think this too weak, and it should be "a DFA which is a tree". Consider a DFA which is a simple diamond: e.g. states {1,2,3,4}, transitions 1->2 on "A", 1->3 on "B", 2->4 on "A", 3->4 on "B", with state 4 as the only accepting state. This definitely a loop-free DFA which accepts the language {"AA","BB"}. The graph is a DAG, but not a tree, so the DFA is not a trie. --Saforrest (talk) 02:00, 21 May 2015 (UTC)
- You're right. I updated the article to read "tree-shaped". In fact, every finite state machine that accepts a finite language is acyclic (because every cycle allows pumping), but not necessarily tree-shaped. QVVERTYVS (hm?) 13:48, 21 May 2015 (UTC)
"Lookup and membership are easily described"
[edit]..but unfortunately it's in Haskell. Crazy. How about pseudocode, or a more commonly used language?! 110.20.194.252 (talk) 08:33, 10 February 2017 (UTC)
Came here to say the same - I don't see other algorithms / data structures articles using this language -John Lunney (talk) 19:29, 3 December 2017 (UTC)
I rewrote it, hope that's better now. SamaelBeThouMyAlly (talk) 15:49, 3 January 2018 (UTC)
bitwise tries
[edit]The bitwise trie section indeed is a kind of incomplete. A link to this article may help: [[1]] — Preceding unsigned comment added by Pela3141 (talk • contribs) 13:43, 15 February 2021 (UTC)
In the example, why are there no arrows to "ten" or "inn"?
[edit]I don't understand why "ten" and "inn" are isolated in the example. I would expect arrows to point to them from "te" and "in", respectively.
A bit more description for self.values in Node strucutre?
[edit]First time reader of trie. I got stuck what the self.values meant in class Node.
From what I understood it can be self.isEnd to indicate if the word was inserted. Like if "apple" was inserted, the node for the letter "e" would indicate isEnd=True. In this way we can tell the word "app" wasn't inserted, it's just there as a prefix.
Can it also be used to indicate quantity like there are 4 apples and 2 oranges using an attribute called self.quantity?
If the above are true, can I add the following text under https://en.wikipedia.org/wiki/Trie#Algorithms section?
"self.values can be used to indicate an attribute like quantity (like there are 4 apples). We can also add an attribute called self.isEnd to indicate if a word was inserted (insert function needs to be modified accordingly for this work)." — Preceding unsigned comment added by Twoplants (talk • contribs) 09:46, 8 October 2021 (UTC)
Floating point as bit string
[edit]@David Eppstein: This is a good catch, thanks. Yeah, it can be represented as a IEEE 754 binary string. I've copy-edited per Reema (2014); I'm searching for sources that support the new revision (representation as binary string) to cite there. WikiLinuz {talk} 🍁 03:12, 13 March 2022 (UTC)
GA Review
[edit]| GA toolbox |
|---|
| Reviewing |
- This review is transcluded from Talk:Trie/GA1. The edit link for this section can be used to add comments to the review.
Reviewer: The Antediluvian (talk · contribs) 14:41, 17 April 2022 (UTC)
| Rate | Attribute | Review Comment |
|---|---|---|
| 1. Well-written: | ||
| 1a. the prose is clear, concise, and understandable to an appropriately broad audience; spelling and grammar are correct. |
The article needs an "Overview" section that summarizes the functioning of the data structure, including why it should be used, and where it should be used and where not to. The lists within the "Applications" section needs to be expanded a little. Article should also need to provide information regarding time-space complexities on the body. | |
| 1b. it complies with the Manual of Style guidelines for lead sections, layout, words to watch, fiction, and list incorporation. |
Currently it needs minor copy editing for expressions like "don't", "doesn't", etc. A few sentences should also needs to be written in encyclopedic tone, specifically sections "Implementation strategies" and "Replacing other data structures". "Operations" section also needs copy editing. | |
| 2. Verifiable with no original research: | ||
| 2a. it contains a list of all references (sources of information), presented in accordance with the layout style guideline. |
Article meets the verifiability criteria. | |
| 2b. reliable sources are cited inline. All content that could reasonably be challenged, except for plot summaries and that which summarizes cited content elsewhere in the article, must be cited no later than the end of the paragraph (or line if the content is not in prose). |
Article incorporates academic source. | |
| 2c. it contains no original research. |
Article meets WP:NOR criteria. | |
| 2d. it contains no copyright violations or plagiarism. |
Article meets this criteria. | |
| 3. Broad in its coverage: | ||
| 3a. it addresses the main aspects of the topic. |
Article covers the general details regarding a trie that's required for a data structure article. | |
| 3b. it stays focused on the topic without going into unnecessary detail (see summary style). |
I think it gives undue weight to the "External memory tries" section, which should be merged or expanded. The phrasing is also unclear. Like I pointed out already, the article needs an "overview" section. The "Applications" section should also needs to be split into separate paragraphs. | |
| 4. Neutral: it represents viewpoints fairly and without editorial bias, giving due weight to each. |
Article is neutrally worded. | |
| 5. Stable: it does not change significantly from day to day because of an ongoing edit war or content dispute. |
No active disputes or edit-wars lately. | |
| 6. Illustrated, if possible, by media such as images, video, or audio: | ||
| 6a. media are tagged with their copyright statuses, and valid non-free use rationales are provided for non-free content. |
The three images used in the article are tagged accordingly, and a summary is provided. | |
| 6b. media are relevant to the topic, and have suitable captions. |
Article meets this criteria. | |
| 7. Overall assessment. |
Currently the article does not meet the WP:GACR. | |
- I am putting this article WP:GAN/I#HOLD. I will reassess the article once the following changes were made. The Antediluvian 15:25, 17 April 2022 (UTC)
- Thanks for taking this. I'll make the requested changes soon. --WikiLinuz {talk} 🍁 15:57, 17 April 2022 (UTC)
- I've addressed the points 1 and 3b. --WikiLinuz {talk} 🍁 22:52, 17 April 2022 (UTC)
- I have reassessed the article. Can you please mention the time and space complexities of tries in the lead section? Currently, there doesn't seem to be a sentence talking about that information. The Antediluvian 09:27, 18 April 2022 (UTC)
- Yeah, I've added that. --WikiLinuz {talk} 🍁 22:31, 18 April 2022 (UTC)
- Thankyou. I will approve the article. Best regards, The Antediluvian 22:38, 18 April 2022 (UTC)
- Yeah, I've added that. --WikiLinuz {talk} 🍁 22:31, 18 April 2022 (UTC)
- I have reassessed the article. Can you please mention the time and space complexities of tries in the lead section? Currently, there doesn't seem to be a sentence talking about that information. The Antediluvian 09:27, 18 April 2022 (UTC)
- I've addressed the points 1 and 3b. --WikiLinuz {talk} 🍁 22:52, 17 April 2022 (UTC)
- Thanks for taking this. I'll make the requested changes soon. --WikiLinuz {talk} 🍁 15:57, 17 April 2022 (UTC)
Early stopping
[edit]The algorithms in this article do not end the path once there is only a single leaf beyond the node. This is probably so in the source cited (I don't have access to it), but it can make a trie much higher than actually necessary. This optimization is shown, fox example, in Sedgewick, Robert: Algorithms (Addison-Wesley, 1988). AmirOnWiki (talk) 14:44, 28 June 2025 (UTC)
Java implementation vs "pseudocode"
[edit]The code examples here used a pseudocode that I do not think is descriptive and in fact more confusing to readers than simply using a concrete Java implementation. My reasoning is as follows:
- This pseudocode is less descriptive. It does not present types and relies on syntactic quirks of other languages that ironically, make it less universal. For example, in the removal method,
key[1:]is an inherently Python-esque syntax. It also introduces:=while using=for equality checking, etc. There are also places where it relies on implicit (rather than explicit) constructs, likeNode()(which probably refers to calling a constructor, but isn't defined anywhere). - The claim that Java is more verbose or more noisy is inherently not true. This is not even opinion, but the pseudocode examples introduce noise keyword like "then", "do", "repeat", "end", etc. At that point why not just omit those keywords and rely on Python-style indentation, which is just as descriptive on where scope begins and ends? Furthermore, making the parameter
xexplicit is also additional noise when it could much more simply just be expressed as a method of the structure on itself. - Java (at the very least, in my opinion) has very simple syntax, and I don't see how the pseudocode provided is somehow more descriptive than the Java examples (in my opinion it is less descriptive but this is my opinion)
- The claim that "pseudocode has a wider appeal" is inherently meaningless because pseudocode is not an actual language. If the pseudocode isn't any more descriptive or simple than a real code example, at that point, why not just write the instructions in English? It feels like pseudocode is being used here just for the sake of using pseudocode. Nothing is gained from using it (it's not "easier to read" nor is it "syntactically simpler", it is being chosen simply for not being an actual language, etc.)
- Concrete code examples are not only better but can be demonstrated on the user's end if they choose to compile and produce the code themselves.
- The pseudocode examples are more tedious and difficult to maintain. Keywords are manually bolded and line counts are manually written rather than automatically generated.
I also wish to bring the attention that the code was previously written in Python (and there was a second implementation in Go, which was removed for being extraneous which I agree with) but was unilaterally changed to pseudocode by User:WikiLinuz, without any talk. I replaced the pseudocode with Java examples but @Ergur removed these changes. 50.101.202.178 (talk) 16:16, 28 October 2025 (UTC)
Within WikiProject Computer science, the consensus has generally been that where possible, algorithms should be presented in pseudocode.
see WP:PSEUDOCODE --WikiLinuz (talk) 16:55, 28 October 2025 (UTC)There are no universally accepted standards for presenting algorithms on Wikipedia.
is the first sentence of that section. Furthermore, the Trie is a data structure, not just some raw set of instructions. There are details of it that cannot otherwise be well-described. Your pseudocode is not language-agnostic either. What do we gain by writing it in pseudocode when other code examples are sufficiently descriptive (if not more) anyway? 2605:8D80:1392:FF71:CD14:B7C5:9B72:D5B0 (talk) 17:12, 28 October 2025 (UTC)- I implore you to finish reading the paragraph. It lays out the case for pseudocode quite well and states, in no uncertain terms, that the general consensus is to use pseudocode, where possible.
- If you think the code is unclear then your effort is best spent fixing it. And if you do, read MOS:ALGO. It seems User:WikiLinuz followed the "General guidelines" well, while writing the code.
- My opinion, in brief, is that Java is not simple. The "noise" I mentioned was referring to keywords like
public,final,static, et c. and concepts like classes and generics; not to mention whatever@SuppressWarnings("unchecked")means. — xo Ergur (talk) 08:05, 29 October 2025 (UTC)that the general consensus is to use pseudocode, where possible.
My case is that you are using pseudocode for the sake of using pseudocode, not because it is clearer or easier to understand. And have you seen final or static used once in my code? And how are you going to claim classes are unclear, when that is quite literally what the current code has (named structure)? And how are generics any more confusing than this random untyped value field we have now? 50.101.202.178 (talk) 15:12, 29 October 2025 (UTC)- Here is one line from the edit in question:
::::: public static final int ALPHABET_SIZE = 26; // Alphabet size of the Trie :::::
- It contains all three keywords I mentioned.
- Appealing to MOS:ALGO, specifically "Within WikiProject Computer science, the consensus has generally been that where possible, algorithms should be presented in pseudocode", I ask: Why is pseudocode not possible? — xo Ergur (talk) 16:06, 29 October 2025 (UTC)
- I think that above example is self-explanatory even if the reader is not familiar with those words. Anyone at least remotely familiar with computer science or programming would be able to interpret this as representing a constant intrinsic to that class.
- To answer your question, the page does not say to use pseudocode any cost, or to use it for the sake of all code snippets to be in pseudocode, and I imagine using pseudocode when it is not only no more descriptive but even less descriptive/expressive, is one such reason not to. Actions on a trie or any other such data type is very different from a graph tracing algorithm where the types of things are self-explanatory or the actions are easily expressed in pseudocode, but in this such example we have much more intricacies and details that ought to be expressed using actual constructs rather than abstract instructions. 50.101.202.178 (talk) 16:52, 29 October 2025 (UTC)
- I don't understand you point; Is the Trie data structure too complicated for pseudocode? I don't think you'll find much agreement there. At least among my peers, Tries are seen as one of the simplest data structures. Personally, I think it is easier to implement Tries than singly linked lists, since you never have to remove an internal node in a Trie.
- I looked over your Java code and noticed that it shows one of the strengths of pseudocode. In the Java code it is (implicitly) assumed that the alphabet is a subset of ASCII, namely the lower case English letters. Allowing for a more general alphabet would require discussing how to map the alphabet in question to and doing this in Java you would maybe use an interface, further showing how Java is not a simple language. This is not require in pseudocode; You can just assume that your alphabet is and focus on what matters. — xo Ergur (talk) 08:31, 30 October 2025 (UTC)
In the Java code it is (implicitly) assumed that the alphabet is a subset of ASCII, namely the lower case English letters.
How or why is this a problem? As opposed to the pseudocode that doesn’t even specify the length for the array? What is your point? There was a comment left on that constant to make it clear that the alphabet size can depend on whatever the chosen alphabet is. Or shouldALPHABET_SIZEjust be another implicitly defined constant that was never specified earlier, like the pseudocode example? 2620:101:F000:7C0:0:0:0:C2CF (talk) 08:54, 30 October 2025 (UTC)- In fact, I chose to use Java rather than C++ because Java is simpler than C++ while still being descriptive and expressive enough to demonstrate things like types, which are important. 2605:8D80:1395:18AF:C01:9946:449F:5185 (talk) 02:32, 1 November 2025 (UTC)
- Okay, I am going to give this another day or two to wait for a response before I go through with my changes. 2605:8D80:5825:1532:D58:188C:5434:D54 (talk) 01:18, 4 November 2025 (UTC)
- Why? The only two people to reply have been against this change. I stopped replying because you'd gotten combative and it wasn't constructive.
- All I can say is read MOS:ALGO and MOS:CODE. The take away should be clear: Pseudocode when possible. — xo Ergur (talk) 08:13, 4 November 2025 (UTC)
- Refusing to respond to a talk page is not a “win a dispute for free” button. With all due respect, the crux of your argument just sounds like “use pseudocode for the sake of using pseudocode”, not whether the pseudocode is actually more descriptive or not or whether the proposed code samples are less descriptive or somehow more esoteric to readers who may or may not have experience with the language in question. Even from a utilitarian perspective, the pseudocode example is even more difficult to maintain, with the line numbers being a separate text box which has to be manually updated to the number of lines. ~2025-31148-11 (talk) 13:53, 4 November 2025 (UTC)
- I think you should relax your expectation a little bit. You are trying to destructively edit the article in a manner that contradicts the Manual of Style. It's not helpful to think of this thread as a debate to win. It is a discussion where people can voice their opinion; A process that may take time. In that time I recommend reading MOS:CODE and MOS:ALGO. — xo Ergur (talk) 14:29, 4 November 2025 (UTC)
- You are now making untruthful claims about others, claiming that I am acting destructively or maliciously. If that were the case I would not bring attention to the fact that the line numbers are being manually written which is terrible for maintenance. No one claimed this was a debate or some kind of “competition” to win, least of all me. I simply remarked that refusing to respond will not silence the issue, and I did not make any personal attacks, only questioned in what way the pseudocode example was better at being alphabet-agnostic. Also, to claim I am contradicting the manual of style by suggesting to use an actual language rather than “pseudocode” (which has varying style for everyone and not a universally agreed-upon concept) is factually untrue, when it states clearly
There are no universally accepted standards for presenting algorithms on Wikipedia.
,To remain language-neutral, choose languages based on clarity, not popularity.
(which the Java code is pretty self-explanatory to read through even for someone without knowledge of that language), etc. ~2025-31242-73 (talk) 18:27, 4 November 2025 (UTC)- By "destructive" I just mean that it removes things.
- This discussion is not reliant on my daily input and it is not about getting the last word. It is fine if it goes a few days without addition; With time more people will see it.
- Quoting one line from MOS:ALGO isn't helpful. If you read it in whole it you'll see it has a clear preference for pseudocode. For example:
Within WikiProject Computer science, the consensus has generally been that where possible, algorithms should be presented in pseudocode. The use of pseudocode is completely language agnostic, and is more NPOV with respect to programming languages in general. Pseudocode also provides far more flexibility with regard to the level of implementation detail, allowing algorithms to be presented at however high a level is required to focus on the algorithm and its core ideas, rather than the details of how it is implemented. Finally, suitably high-level pseudocode provides the most accessible presentation of an algorithm, particularly for non-programmers.
— xo Ergur (talk) 08:26, 5 November 2025 (UTC)- This ~2025-31406-21 (talk) 13:55, 5 November 2025 (UTC)
- Then can you explain qualitatively how the Java code introduces features that make the code less accessible to readers? There is nothing really inherently unreadable about it.
- I am willing to make a counter-proposal, to use a syntax-highlighted block which would highlight important keywords and provide line numbering automatically, while remaining “pseudocode” as in avoiding using features that are “Java-specific”. The code will not have to be a specific language, though it will resemble an object-oriented C-style syntax language that is still agnostic enough to demonstrate concepts at a high-enough level. Is that acceptable? ~2025-31485-46 (talk) 19:49, 5 November 2025 (UTC)
- You are now making untruthful claims about others, claiming that I am acting destructively or maliciously. If that were the case I would not bring attention to the fact that the line numbers are being manually written which is terrible for maintenance. No one claimed this was a debate or some kind of “competition” to win, least of all me. I simply remarked that refusing to respond will not silence the issue, and I did not make any personal attacks, only questioned in what way the pseudocode example was better at being alphabet-agnostic. Also, to claim I am contradicting the manual of style by suggesting to use an actual language rather than “pseudocode” (which has varying style for everyone and not a universally agreed-upon concept) is factually untrue, when it states clearly
- I think you should relax your expectation a little bit. You are trying to destructively edit the article in a manner that contradicts the Manual of Style. It's not helpful to think of this thread as a debate to win. It is a discussion where people can voice their opinion; A process that may take time. In that time I recommend reading MOS:CODE and MOS:ALGO. — xo Ergur (talk) 14:29, 4 November 2025 (UTC)
- Refusing to respond to a talk page is not a “win a dispute for free” button. With all due respect, the crux of your argument just sounds like “use pseudocode for the sake of using pseudocode”, not whether the pseudocode is actually more descriptive or not or whether the proposed code samples are less descriptive or somehow more esoteric to readers who may or may not have experience with the language in question. Even from a utilitarian perspective, the pseudocode example is even more difficult to maintain, with the line numbers being a separate text box which has to be manually updated to the number of lines. ~2025-31148-11 (talk) 13:53, 4 November 2025 (UTC)
- Okay, I am going to give this another day or two to wait for a response before I go through with my changes. 2605:8D80:5825:1532:D58:188C:5434:D54 (talk) 01:18, 4 November 2025 (UTC)
- Please, just read MOS:CS. It is the manual of style. It's shows a clear preference to pseudocode along with guidelines on how to write it.
- Also, I removed the line numbers. Such a simple algorithm doesn't need line numbers. Honestly, no code on Wikipedia should need line numbers. — xo Ergur (talk) 08:34, 6 November 2025 (UTC)
- This is looping back to using pseudocode for the sake of using pseudocode, even if it is more verbose. Nothing in the example code is irrelevant or noise, the sole difference is that the code happens to be of a real language. ~2025-31803-89 (talk) 00:25, 7 November 2025 (UTC)
- Seeing as this discussion isn’t going anywhere, I’m going to use a WP:3O. ~2025-32305-73 (talk) 05:59, 9 November 2025 (UTC)
- I'm here from 3O, but I do not have the requisite knowledge to comment on programming languages. I just want to note that MOS:ALGO states that "Within WikiProject Computer science, the consensus has generally been that where possible, algorithms should be presented in pseudocode". This doesn't mean that all pages within the scope of WikiProject Computer Science are bound by its supposed guidelines, but that there needs to be a compelling reason to go against it. That is: it's certainly possible that other policies or particular circumstances might mean that going against the WikiProject consensus makes sense. WikiProjects are not arbiters of policies and guidelines, but groups of likeminded editors who focus on a particular topic and usually have useful advice for editing on that topic. Doesn't mean their consensus is binding on every article.
- The IP editor seems to present a rationale for why the psuedocode is inferior to a Java implementation. Like I said before, I do not have the technical expertise to evaluate this, but that argument should be addressed on its merits rather than referring to the local consensus of a WikiProject. Katzrockso (talk) 06:33, 9 November 2025 (UTC)
- Thanks for your response from WP:3O. And @Ergur, I generally agree that we should address the question of whether the Java implementation I proposed is clearer or not to the reader, compared to the existing pseudocode version. ~2025-32447-81 (talk) 00:48, 11 November 2025 (UTC)
- @Ergur Seeing as you are still active on Wikipedia but have not responded, I wish to remind you respectfully that you have the opportunity to voice your opinions on the merits of pseudocode/concrete code implementation and if you choose not to (which is within your right to do so) then I will proceed with restoring my changes. ~2025-32447-81 (talk) 22:14, 11 November 2025 (UTC)
- My opinion has been voiced. A great case for pseudocode is also made in MOS:ALGO.
- I feel obliged to mention that you are all but "respectful." This is the second time you have pressured me to respond with threats updating the article in a way that has been discussed here without you getting any consensus.
- There is nothing wrong with waiting to see if this is something people are actually interested in. I don't see why a "good article" needs to updated to suit the whims of a single editor. — xo Ergur (talk) 09:58, 12 November 2025 (UTC)
- Is a reminder that the world continues to revolve and talk page discussions proceed even when you refuse to acknowledge them, "a threat"? You already stated prior that you refused to respond because you perceived my response to be "combative". At most I have been defensive of my view, but I don't see how I've been disrespectful. I haven't made any personal attacks, insults, or rude remarks. The most I have done is use rhetorical questions, and question your choices and arguments.
A great case for pseudocode is also made in MOS:ALGO.
And how does this address myrationale for why the psuedocode is inferior to a Java implementation
as @Katzrockso said?There is nothing wrong with waiting to see if this is something people are actually interested in.
Do you believe that you are entitled to infinite time to delay changes proposed on a talk page? Do you assume that everyone else who potentially would be interested in a change going through or not would for certain side with you? ~2025-32447-81 (talk) 20:35, 12 November 2025 (UTC)- @Ergur I find this response here to be unbecoming of collaborative discussion - it appears to me to be WP:STONEWALLING any potential change to the article. Like I said before, I have no opinion on the merits of the change, but failure to engage in "substantive discussion of the issues related to the change", this falls right into the definition of stonewalling. Katzrockso (talk) 21:01, 12 November 2025 (UTC)
- I don't fully agree with the arguments advanced in the discussion above, but on the whole I think the pseudocode is a better approach that is likely to be more widely understood. Thanks, Suriname0 (talk) 16:34, 13 November 2025 (UTC)
- What makes you say this? Structure and syntax-wise there isn't a huge difference besides the fact that the pseudocode example resembles Python syntax whereas Java resembles C syntax, and Java forces explicit declaration of types. The pseudocode examples have the additional verbosity of manually specifying the otherwise implicit "this" parameter. ~2025-32447-81 (talk) 20:47, 13 November 2025 (UTC)
- I'd like to hear some more elaborated justifications for this, I feel like this would be an appropriate request when coming to decide on the merits of either choice. ~2025-34236-49 (talk) 01:53, 19 November 2025 (UTC)
- @Katzrockso any thoughts on what to do next, seeing as someone has provided an opinion but has not elaborated, especially after being prompted over the course of more than a week? ~2025-34236-49 (talk) 18:20, 21 November 2025 (UTC)
- @~2025-34236-49 if other editors do not respond, you can assume consensus by WP:SILENCE and make the change. If editors after that point continue to revert back changes without communication, you would have to start being up the point WP:CIR and perhaps ask for sanctions somewhere. Katzrockso (talk) 22:09, 21 November 2025 (UTC)
- Nothing has ever been reverted "without communication." Also, WP:SILENCE is not how I would describe this thread.
- How does it make sense to go through with the change when three out of three people who have taken a stance are opposed? — xo Ergur (talk) 06:54, 23 November 2025 (UTC)
- I have given you more than ample amount of time to communicate and actually address the points as had @Katzrockso in the WP:3O process, and you made the decision to consciously ignore the talk page even when prompted to multiple times.
- WP:SILENCE is more than appropriate to describe the thread when each individual who gave their opinion did not back up said opinion when given more than a week to. And what does
three out of three people who have taken a stance are opposed
have any weight on if none of these aforementioned people are willing to justify their position? This isn’t an opinion piece or presidential election where you can “vote” on your desired outcome without a justification. If I organised a thousand people to spam the Santa Claus talk page with “Santa Claus, the Easter Bunny, and the Tooth Fairy are real”, or “Norway is located in the Southern Hemisphere”, would that have any weight on anything without logical justifications? I am using nonsensical examples here, but I believe my point is decently illustrated by it nonetheless.
- At any rate, you’re edit warring now and have shown beyond a reasonable doubt that you aren’t interested in dialogue or coming to an agreeable conclusion to this conflict, and as such I will be involving dispute resolution and if necessary, other means. In no case are you entitled to delay a talk page dispute and its outcome indefinitely by refusing to communicate. ~2025-35874-73 (talk) 17:25, 23 November 2025 (UTC)
- @Katzrockso @Ergur See Wikipedia:Dispute resolution noticeboard#Trie and feel free to provide your summaries of the dispute at the linked section. ~2025-35909-34 (talk) 18:35, 23 November 2025 (UTC)
- Sorry, this was meant to link to WP:COMMUNICATE rather than WP:CIR, I haven't been checking this talk page very much. Just to clarify Katzrockso (talk) 23:47, 23 November 2025 (UTC)
- @~2025-34236-49 if other editors do not respond, you can assume consensus by WP:SILENCE and make the change. If editors after that point continue to revert back changes without communication, you would have to start being up the point WP:CIR and perhaps ask for sanctions somewhere. Katzrockso (talk) 22:09, 21 November 2025 (UTC)
Independent opinion
[edit]I saw this on the noticeboard. I am experienced in comp science. I think pseudocode would be far more preferable. Java is just one language. Algorithms may be written in many. And the reader may not know Java, but can understand pseudocode anyway. Algorithms are language Independent. That is all I have to say. Nothing more. Good day. Yesterday, all my dreams... (talk) 13:09, 27 November 2025 (UTC)
- I agree. The proposed Java implementation results in superfluous keywords (
public,static) and particular choices such as privileging the ASCII lowercase alphabet. None of these aid the reader's understanding of the topic compared to a clean, language-independent presentation of the ideas. Elestrophe (talk) 18:04, 27 November 2025 (UTC) - Let me address each of these points yet again, despite having addressed these exact points repeatedly in the discussion above:
Java is just one language.
But its syntax is not, it is universally understood. To give an example of exactly what I mean, let us take a look at this simple algorithm that finds the largest value in a list, using the same pseudocode style as the article. As you have declared your proficiency in computer science, I am sure you are aware of exactly what is happening in the following code.
findLargestValue(list): if not list or list.length == 0 then raise IllegalArgumentException("Array is empty") end if largest := list[0] for i in list do if list[i] > largest then largest = list[i] end if repeat return largest
- Meanwhile, in Java:
public static int findLargestValue(int[] list) { if (list == null || list.length == 0) { throw new IllegalArgumentException("Array is empty"); } int largest = list[0]; for (int i: list) { if (list[i] > largest) { largest = list[i]; } } return largest; }
- The logic is expressed in exactly the same way. There are no Java-specific quirks in the Java snippet. Just because it contains the keywords
public(which means that it is a publicly visible method) andstatic(which means that it belongs to no specific instance of any class) does not somehow make the code "inaccessible" to readers. Do you want to see what it would look like if I really did use Java-specific code? This is the same example using Java streams, and the logic has changed completely.
- The logic is expressed in exactly the same way. There are no Java-specific quirks in the Java snippet. Just because it contains the keywords
import java.util.Arrays; public static int findLargestValue(int[] list) { return Arrays.stream(list) .reduce((a, b) -> a > b ? a : b) .orElseThrow(() -> new IllegalArgumentException("Array is empty")); }
Algorithms are language Independent.
Meanwhile, the pseudocode examples in the article, are NOT language-independent. You will see that certain syntax choices, such as array slicing (key[1:], etc.) are very much Python syntax and not some "generic" pseudocode that could be any language theoretically. Furthermore, the pseudocode example shows no types for variables, parameters, or functions. If anything, this is precisely obscuring important details from a reader for no good reason. If you want truly "language-independent" pseudocode, just write the algorithm in plain English, with a few algorithmic concepts, just don't bother using programming language-resemblant syntax at all.The proposed Java implementation results in superfluous keywords
and the pseudocode example does not? What do words likedo,then,end if,repeat, etc. add to the understanding of the algorithm? It is quite literally noise that languages have evolved the use of brackets for now.particular choices such as privileging the ASCII lowercase alphabet
If you're so insistent on "not using any particular alphabet", then at least write something in the pseudocode that gives an inherent indication of this to the reader. Meanwhile in the proposed Java implementation I specifically left a comment saying that the size of the alphabet can be chosen depending on what alphabet is chosen. Just because the lowercase ASCII alphabet was chosen for the code, does not mean it is the only valid alphabet to choose, it is merely an example. Why do you assume readers are not intelligent enough to extrapolate this constant to whatever alphabet of their choosing? Frankly I'm not sure what the obsession with this point of contention is, when it's so obviously hardly relevant to anything.
- ~2025-36699-87 (talk) 22:44, 27 November 2025 (UTC)
- At this point your only option is doing an Rfc, given that 3rd opinion consensus is not with you, regardless of your explanations. Good luck with that. Yesterday, all my dreams... (talk) 02:53, 28 November 2025 (UTC)
- Nothing was or is stopping you from responding and defending your points just as I have done and will continue to do. I think that I have given a comprehensive defence and justification for things from your points, and no amount of random third parties repeating that "the Earth is flat" and refusing to back up their claims will change anything. ~2025-36699-87 (talk) 05:21, 28 November 2025 (UTC)
- Please read WP:HEAR about editing against consensus. Thanks. Yesterday, all my dreams... (talk) 07:10, 28 November 2025 (UTC)
- Please WP:PLAYPOLICY elsewhere if you have no intention to provide any constructive feedback to this dispute. WP:ICANTHEARYOU describes your behaviour exactly seeing as you've made no attempt to disprove any of my rebuttals. Thanks. ~2025-36699-87 (talk) 07:39, 28 November 2025 (UTC)
- Is consensus with me or with you? It is certainly with me. I will have no further comment. Yesterday, all my dreams... (talk) 09:04, 28 November 2025 (UTC)
- "With you or with me"? I thought you were an independent party that didn't have a stake in this dispute prior? Yes, go ahead and leave this discussion while continuing to throw the buzzword "consensus" around completely twisted of its actual interpretation because you have demonstrated that you have nothing to contribute and apparently no patience to read arguments and respond meaningfully. ~2025-36699-87 (talk) 09:25, 28 November 2025 (UTC)
- Is consensus with me or with you? It is certainly with me. I will have no further comment. Yesterday, all my dreams... (talk) 09:04, 28 November 2025 (UTC)
- Please WP:PLAYPOLICY elsewhere if you have no intention to provide any constructive feedback to this dispute. WP:ICANTHEARYOU describes your behaviour exactly seeing as you've made no attempt to disprove any of my rebuttals. Thanks. ~2025-36699-87 (talk) 07:39, 28 November 2025 (UTC)
- Please read WP:HEAR about editing against consensus. Thanks. Yesterday, all my dreams... (talk) 07:10, 28 November 2025 (UTC)
- Nothing was or is stopping you from responding and defending your points just as I have done and will continue to do. I think that I have given a comprehensive defence and justification for things from your points, and no amount of random third parties repeating that "the Earth is flat" and refusing to back up their claims will change anything. ~2025-36699-87 (talk) 05:21, 28 November 2025 (UTC)
- At this point your only option is doing an Rfc, given that 3rd opinion consensus is not with you, regardless of your explanations. Good luck with that. Yesterday, all my dreams... (talk) 02:53, 28 November 2025 (UTC)
Counterproposal
[edit]@Ergur I am willing to make a counterproposal, for the sake of not dragging this dispute longer than it needs to be:
- Continue using "pseudocode" (i.e., not any specific language), with the following changes:
- Add types to the code, for clarity
- Avoid useless scoping keywords that can be easily represented with braces for scope, such as
repeat,end if, etc. There will be no need for access specifiers such aspublic,static, etc. - Define in the code a constant for the alphabet size, but it does not have to be specified as 26. This can intentionally be left as a comment.
- Use syntax quirks that do not belong to any language. Right now I am drawing attention to string slicing syntax which is obviously Python-inspired, but is not universally obvious.
- Replace
xin the parameters with something likethisorself. ~2025-36699-87 (talk) 23:37, 28 November 2025 (UTC)
- By the way, if I do not hear an objection to this proposal in a week I'm proceeding with it as according to WP:SILENCE, as this is a completely different proposal to proposing replacing pseudocode with Java. ~2025-36699-87 (talk) 04:58, 29 November 2025 (UTC)
Add types to the code, for clarity
- I'm not opposed to adding input and output lines, as is described in MOS:ALGO. I want to point out that adding types does not automatically add clarity, see the Java implementation with its generics.
Avoid useless scoping keywords that can be easily represented with braces for scope, such as
repeat,end if, etc. There will be no need for access specifiers such aspublic,static, etc.- I think consistency across Wikipedia is preferable; I am familiar with articles omitting them entirely (Knuth-Morris-Pratt and the example in MOS:ALGO#Example), but I don't recall ever seeing braces in pseudocode. I would prefer they stay; I have used Python long enough to burn myself on whitespace-based control flow.
Define in the code a constant for the alphabet size, but it does not have to be specified as 26. This can intentionally be left as a comment.
- There is already. It appears in the definition of the Node structure. I agree that it should also appear in the for loop in Trie-Delete.
Use syntax quirks that do not belong to any language. Right now I am drawing attention to string slicing syntax which is obviously Python-inspired, but is not universally obvious.
- Perfectly reasonable. I don't think the Java example is the solution, though, unless it explicitly stated something like key.substring(1, key.length - 1) (although in Java it is still too slow).
Replace
xin the parameters with something likethisorself.- This is only helpful for people familiar with the jargon.
- All that said, I think this discussion will need to be repeat when a new version of the code has been written. So, while it is a bit much to call this a waste of time, I think we are duplicating work by doing this in the wrong order.
- In my Sandbox [2] I have 2 possible versions to work from. The first is adding onto the current version, with few alterations. The second is a higher level approach I thought my warrant a look too. They aren't quite finished. I will work more on them tomorrow morning; It is night now. — xo Ergur (talk) 00:22, 1 December 2025 (UTC)
- I don't see why using generics or labeling the stored type as
Tis a problem. If it's a problem that we gave an explicit size for the alphabet size, then it shouldn't be a problem if we don't specifyT, most people are capable of comprehending tparams or are conscious of what they would denote. ~2025-36699-87 (talk) 01:03, 3 December 2025 (UTC)
- I don't see why using generics or labeling the stored type as
Add types to the code, for clarity
- Support as long as it's unobtrusive.
Avoid useless scoping keywords that can be easily represented with braces for scope, such as
repeat,end if, etc. There will be no need for access specifiers such aspublic,static, etc.- I don't think there's any reason to switch between keywords and braces, they are functionally equivalent. We should stick with the MOS:ALGO recommendation here for consistency with other articles.
Define in the code a constant for the alphabet size, but it does not have to be specified as 26. This can intentionally be left as a comment.
- Support.
Use syntax quirks that do not belong to any language. Right now I am drawing attention to string slicing syntax which is obviously Python-inspired, but is not universally obvious.
- Support removing Python-style string slicing syntax.
Replace
xin the parameters with something likethisorself.- No preference.
- Elestrophe (talk) 06:10, 29 November 2025 (UTC)
I don't think there's any reason to switch between keywords and braces, they are functionally equivalent.
Most languages use C-style braces nowadays and even those that don't (like Python) don't use keywords to end scope, they define scope by indentation. There are plenty of other articles on Wikipedia that use pseudocode that do not use keywords to close scopes, and it's usually pretty easy to associate each scope closer with the block it belongs to. As mentioned by the WP:3O earlier in the talk section, MOS:ALGO is a suggestion, not a hard rule. ~2025-36699-87 (talk) 08:14, 29 November 2025 (UTC)- I understand MOS:ALGO is not a hard rule, but we should only override it if there is some particular reason to do so for this specific article. The arguments you are making would apply equally well to any algorithms article, so you should open an RFC to change MOS:ALGO instead of trying to override it one article at a time. Elestrophe (talk) 09:11, 29 November 2025 (UTC)
- Also to be clear MOS:ALGO states
If sufficiently clear, block closing keywords (end, repeat etc.) may be omitted.
So it's fine to omit the ending keywords, but I don't think replacing them with braces is constructive. Elestrophe (talk) 09:14, 29 November 2025 (UTC)- In that case I'll wait another day or two before proceeding with this. ~2025-36699-87 (talk) 23:09, 30 November 2025 (UTC)
- @Elestrophe Do you have an opinion on whether generics/template parameters are acceptable for use in pseudocode? I do not think they make things any less clear personally. ~2025-36699-87 (talk) 05:56, 4 December 2025 (UTC)
- My opinion is they should be used if and only if they make things more clear. "Not any less clear" would not be a good reason. Elestrophe (talk) 06:52, 4 December 2025 (UTC)
- I argue that they are, as they explicitly denote that the type may be substituted for anything. ~2025-36699-87 (talk) 01:01, 5 December 2025 (UTC)
- My opinion is they should be used if and only if they make things more clear. "Not any less clear" would not be a good reason. Elestrophe (talk) 06:52, 4 December 2025 (UTC)
- I am proposing the following edition, of which I have done my best to remove the Java-isms from. I believe this still captures the core essence of the algorithm without any language-specific quirks and is still understandable by non-programmers.
- The core structure of
Trie: class Trie<T> { static const int ALPHABET_SIZE = /* Alphabet size of the Trie */; Trie<T>[] children; // An array of length ALPHABET_SIZE of Tries T value; // The stored value of type T, of the trie node /** * Constructs a new instance of a Trie. */ Trie() { children = new Trie[ALPHABET_SIZE]; } }
- The method
Trie::find: /** * Finds the value from an associated key in the Trie. * * @param key The key to search for * @return The stored value associated with that key */ T find(this, String key) { Trie<T> x = this; for (int i = 0; i < key.length(); ++i) { if (x.children[key[i]] == null) { return null; } x = x.children[key[i]]; } return x.value; }
- The method
Trie::insert: /** * Inserts a value with the associated key into the Trie. * * @param key The key of the new Trie node. * @param value The stored value of the new Trie node. */ void insert(this, String key, T value) { Trie<T> x = this; for (int i = 0; i < key.length(); ++i) { if (x.children[key[i]] == null) { x.children[key[i]] = new Trie<T>(); } x = x.children[key[i]]; } x.value = value; }
- The method
Trie::remove: /** * Removes a node from the Trie. * * @param key The key of the node to remove * @return The node that was removed from the Trie */ Trie<T> remove(this, String key) { if (key.isEmpty()) { this.value = null; } else { if (children[key[0]] != null) { children[key[0]] = children[key[0]].remove(key.substring(1, key.length())); } } if (this.value != null) { return this; } for (Trie<T> child: children) { if (child != null) { return this; } } return null; }
- As always leave your ideas and input below. ~2025-38551-85 (talk) 22:11, 5 December 2025 (UTC)
- Oppose changing from pseudocode to Java, for the same reasons as I mentioned before. Elestrophe (talk) 05:48, 6 December 2025 (UTC)
- This isn't Java????? I quite literally did exactly what I said I would do above without any specific language quirks??? ~2025-36699-87 (talk) 06:34, 6 December 2025 (UTC)
- It now seems to me that this is not a "completely different proposal." I want to remind you that there is an ongoing DRN, that you filed, awaiting your input. — xo Ergur (talk) 12:33, 6 December 2025 (UTC)
- And why is that, because I don't use an absurd and frankly outdated way to express pseudocode? Or are you expecting that "pseudocode" MUST resemble Python or Ada? ~2025-36699-87 (talk) 17:38, 6 December 2025 (UTC)
- Oppose changing from pseudocode to Java, for the same reasons as I mentioned before. Elestrophe (talk) 05:48, 6 December 2025 (UTC)