TABL - Displaying data in tables

pR0Ps
Junior Member
Posts: 29
Joined: 05 Dec 2010, 11:35

TABL - Displaying data in tables

Post by pR0Ps » 12 May 2011, 04:23

This extension allows the hub and clients to send data in table format to other clients supporting the feature. It uses a slightly modified version of the table formatting portion of HTML. This will greatly help release bots and the like that struggle with formatting using tabs. One value that's a little too long/too short breaks the entire layout.

A client must send 'ADTABL' in the initial SUP to signify it supports the extension.

This extension will be used as an extension to the 'MSG' command, making it available in mainchat, chatrooms, PMs, etc. It will be used only as a feature broadcast. This will also allow for falling back to using the traditional tabs and newlines to separate data if the client doesn't support the TABL formatting. The sender will have to parse the markup and generate the equivalent table using newlines and tabs and send that as well for the non-TABL supporting clients.

Extension: TABL

Context: F, T
Message type: F

Flags:
NA | Table name
TD | Table data

The table name is optional and consists of an escaped string.
The table data is the data that the table holds. This will be in traditional HTML formatting with some modifications. Instead of the TH/TR/TD tags, just H, R, and D tags (header, row, and data) will be used, with an 'E' prefix marking the closing tag (for example, <EH> closes <H>). It will be sent as an escaped string and does not need newlines to make the markup human-readable (see example).

The data must not contain the characters used in defining the tags. This means that the tag characters ('<' and '>') will have to be escaped. Clients that support the TABL extension must also recognize '\<' and '\>' as valid escaped strings.

The hub does not check the table for adherence to standards or verify that the table formatting is correct. The markup will be parsed by clients supporting the TABL extension as best it can (not unlike a web browser does).

Example:

Code:

FMSG ASID +TABL NAExample\sTable TD\<R\>\<H\>Header\s1\<EH\>\<H\>Header\s2\<EH\>\<ER\>\<R\>\<D\>Row\s1,Cell\s1\<ED\>\<D\>Row\s1,Cell\s2\<ED\>\<ER\>
FMSG ASID -TABL Example\sTable:\nHeader\s1[TAB]Header\s2\nRow\s1,Cell\s1[TAB]Row\s1,Cell\s2
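
A minimal sketch of how a sender might build both messages from the same data, assuming the first-draft syntax above. The helper names (adc_escape, build_messages) are made up for illustration, and the cell text is assumed to contain no '<' or '>' characters:

```python
def adc_escape(text):
    """Escape backslash, space and newline the way ADC parameters require."""
    return text.replace("\\", "\\\\").replace(" ", "\\s").replace("\n", "\\n")


def build_messages(sid, name, header, rows):
    """Return (TABL message, tab/newline fallback message) for one table."""
    # Markup in the first-draft syntax: <R>, <H>, <D> with E-prefixed closers.
    markup = "<R>" + "".join("<H>%s<EH>" % h for h in header) + "<ER>"
    for row in rows:
        markup += "<R>" + "".join("<D>%s<ED>" % c for c in row) + "<ER>"
    # Escape as a normal ADC string, then escape the tag brackets as well.
    td = adc_escape(markup).replace("<", "\\<").replace(">", "\\>")
    tabl_msg = "FMSG %s +TABL NA%s TD%s" % (sid, adc_escape(name), td)

    # Fallback for clients without TABL: tabs between cells, newlines between rows.
    lines = [name + ":", "\t".join(header)] + ["\t".join(r) for r in rows]
    plain_msg = "FMSG %s -TABL %s" % (sid, adc_escape("\n".join(lines)))
    return tabl_msg, plain_msg


if __name__ == "__main__":
    for msg in build_messages("ASID", "Example Table",
                              ["Header 1", "Header 2"],
                              [["Row 1,Cell 1", "Row 1,Cell 2"]]):
        print(msg)
```

The point is only that both messages come from one data structure, so the fallback cannot drift out of sync with the table.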

Suggestions welcome.

FlipFlop™
Junior Member
Posts: 23
Joined: 17 Apr 2009, 08:29

Re: TABL - Displaying data in tables

Post by FlipFlop™ » 13 May 2011, 14:55

Finally, nice addition! It's about time clients supported at least a limited set of HTML tags (greylink taking the lead here). I don't understand why the HTML formatting would need to be modified, though.

Also, it could be interesting to develop an HTML or RTF extension which would include table formatting among a lot of other formatting options.

This isn't just interesting for ADC though, NMDC could benefit from this as well: http://www.flexhub.org/forum/index.php?topic=190.0

pR0Ps
Junior Member
Posts: 29
Joined: 05 Dec 2010, 11:35

Re: TABL - Displaying data in tables

Post by pR0Ps » 13 May 2011, 18:01

I'm hesitant to recommend that ADC support any and all kinds of formatting because I could see it getting abused. Nobody wants blinking 32 point bright red text appearing in mainchat (damn <blink> tag). However, I think table formatting doesn't make spamming/getting attention any easier and provides a clear benefit to the users. The way I was imagining it was that it would look pretty much the same as the tab and newline tables, only with properly lined up columns.

Also, I modified the HTML spec a bit to cut down on bandwidth (there could be a lot of tags, depending on the table). Per tag this implementation is 1 or 2 bytes better than the HTML spec. The tags remove redundant information from the name (TR = table row; it's in the TABL command, so we already know it's a table and just need the R for row) and use 'E' instead of the '\\' (escaped '\') to save a byte per closing tag.

I've thought about it a bit more and there is still a lot of room for improvement. I'm changing the extension spec so it uses the following format:

Opening tags are of the format '<[H/R/D]', while closing tags just use the '>' (closes the most recently opened tag, no matter what it is).

Using that style, here is the example from the first post:

Code:

FMSG ASID +TABL NAExample\sTable TD\<R\<HHeader\s1\>\<HHeader\s2\>\>\<R\<DRow\s1,Cell\s1\>\<DRow\s1,Cell\s2\>\>

Since each '<' always has exactly one identifying character after it, and the '>' always closes the most recently opened tag, this format is still easily parseable, and is 6 bytes smaller per tag set than the previous formatting (in this tiny example it saved 36 bytes). In a table with 10 columns and 50 rows (550 tag sets), this would save 3.3KB.
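
To show that this really is easy to parse, here is a rough sketch of a stack-based parser for the format above. It works on the already-unescaped markup and assumes the input is well-formed (every tag is opened inside a row and closed again). The function name parse_draft2 is just for illustration:

```python
def parse_draft2(markup):
    """Parse '<R<HText>...>' markup into a list of rows of (kind, text) cells."""
    rows = []     # one list of cells per <R
    stack = []    # kinds of the currently open tags
    text = ""     # text collected since the last tag boundary
    i = 0
    while i < len(markup):
        ch = markup[i]
        if ch == "<":
            kind = markup[i + 1]   # exactly one identifying character
            stack.append(kind)
            if kind == "R":
                rows.append([])
            text = ""
            i += 2
        elif ch == ">":
            kind = stack.pop()     # closes the most recently opened tag
            if kind in ("H", "D"):
                rows[-1].append((kind, text))
            text = ""
            i += 1
        else:
            text += ch
            i += 1
    return rows


if __name__ == "__main__":
    print(parse_draft2("<R<HHeader 1><HHeader 2>><R<DRow 1,Cell 1><DRow 1,Cell 2>>"))
```

A real client would also need to cope with unbalanced tags instead of assuming clean input.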

I should add that only one table should be sent per MSG, multiple tables require multiple commands.

FlipFlop™
Junior Member
Posts: 23
Joined: 17 Apr 2009, 08:29

Re: TABL - Displaying data in tables

Post by FlipFlop™ » 13 May 2011, 21:54

Since it won't be pure HTML anyway, to really save bandwidth you might consider using this syntax: http://www.mediawiki.org/wiki/Extension:SimpleTable

andyhhp
Junior Member
Posts: 30
Joined: 18 Feb 2010, 17:44
Location: England

Re: TABL - Displaying data in tables

Post by andyhhp » 15 May 2011, 21:01

If you are going to implement tables, then can I suggest that it include rowspan and colspan as well.

As for bandwidth, I suggest that a better longterm solution would be for hubs and clients to implement ZON/ZOFF. For my hubsoft where everything is sent in UDP packets, we implement this behind the scenes.

~Andrew
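
A quick way to get a feel for the compression argument, using Python's zlib module (ZLIF is zlib-based as far as I know, but treat that as an assumption). The helper names are hypothetical, and because every cell here holds the same text, the compressed sizes will be far better than for a real table:

```python
import zlib


def build_markup(rows, cols, row_open, row_close, cell_open, cell_close):
    """Build a table of identical short cells using the given tag strings."""
    cells = "".join(cell_open + "cell" + cell_close for _ in range(cols))
    row = row_open + cells + row_close
    return (row * rows).encode("utf-8")


def sizes(data):
    """Return (raw size, zlib-compressed size) in bytes."""
    return len(data), len(zlib.compress(data))


if __name__ == "__main__":
    # 50 rows by 10 columns, unescaped, in the first and second draft syntaxes.
    verbose = build_markup(50, 10, "<R>", "<ER>", "<D>", "<ED>")
    compact = build_markup(50, 10, "<R", ">", "<D", ">")
    print("first draft :", sizes(verbose))
    print("second draft:", sizes(compact))
```

Running it against real table contents would give a fairer comparison of how much the shorter tags still matter once compression is on.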

pR0Ps
Junior Member
Posts: 29
Joined: 05 Dec 2010, 11:35

Re: TABL - Displaying data in tables

Post by pR0Ps » 25 Jun 2011, 19:21

ZON and ZOFF would be a good idea for clients that support it, but for the clients that don't (and to a lesser extent, the ones that do), the message size should still be kept as small as possible without sacrificing features.

Changes to previous proposal:
  • Add parameters to tags
  • Remove the <H> tag (just use BT AC params)
  • Remove the NA and TD flags (send just table data)
  • Put a top-level tag in the data, like the <table> tag in HTML
  • Allow multiple tables per message
Adding parameters to tags:
To support multiple parameters in tags, there needs to be an indication that there are no more parameters and the data is starting. This is why I'm proposing that a closing bracket be added to the opening tag. This would give opening tags the format '<[R/D][params]>' (where params holds a list of space-separated flags, each a single-character name followed by its value, much like the ADC protocol), while closing tags still just use '>' (which closes the most recently opened tag, no matter what it is). This format allows all the essential information to be communicated while leaving room for multiple parameters.

Proposed parameter guide:
  • A = align
    • L=left
    • R=right
    • C=center
    • [default]=L
  • V = valign
    • T=top
    • M=middle
    • B=bottom
    • [default]=T
  • W = wrap
    • F=False
    • T=True
    • [default]=T
  • B = bold
    • F=False
    • T=True
    • [default]=F
  • C = colspan
    • [number]
    • [default]=1
  • R = rowspan
    • [number]
    • [default]=1
Note that all parameters are inherited from the parent tag (except row/colspans) and don't need to be respecified unless they need to be changed (like HTML). Also note that I'm not proposing a graphical table display, just a formatted text-based display so a lot of HTML table tags don't apply.
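
The inheritance rule could be sketched like this. It is only an illustration of the guide above, not part of the proposal; the names DEFAULTS, NOT_INHERITED and resolve_params are made up, and no validation of the values is attempted:

```python
# Defaults from the parameter guide above.
DEFAULTS = {"A": "L", "V": "T", "W": "T", "B": "F", "C": "1", "R": "1"}
# Row/colspans apply only to the tag that sets them.
NOT_INHERITED = ("C", "R")


def resolve_params(parent, flags):
    """Return the effective parameters of a tag.

    parent -- effective parameters of the enclosing tag (DEFAULTS at top level)
    flags  -- this tag's own flags, e.g. ["R2", "BT", "AC"]
    """
    effective = dict(parent)            # inherit everything...
    for key in NOT_INHERITED:
        effective[key] = DEFAULTS[key]  # ...except the spans
    for flag in flags:
        effective[flag[0]] = flag[1:]   # first character names the parameter
    return effective


if __name__ == "__main__":
    table = resolve_params(DEFAULTS, ["VM"])
    row = resolve_params(table, [])
    print(resolve_params(row, ["R2", "BT", "AC"]))
```

Each tag only needs its parent's effective parameters, so a parser can keep them on the same stack it uses for the open tags.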

Removing the header tag:
Since it is now possible to add parameters to tags, it seems redundant to use a header tag, which just bolds and centers the content. To achieve the same effect, just specify bold=true and align=center in the parameters of a data tag (<H>test> --> <DBT AC>test>).

Removing the flags:
Since table titles and other data can just be included within the broadcast (outside a top-level tag), there isn't any point in having them identified by flags. Taking the option away from the client and putting it in the hands of the sender means that the table will look more consistent across multiple clients, which is what this extension aims to do.

Top-level tag:
Having a top-level tag will allow 'default' table parameters to be set for the entire table. It would have the format '<T[params]>'. This will also allow multiple tables to be sent in one broadcast (which is good for clients that support ZLIF). Nesting tables inside each other (having a new top-level tag inside a table cell) is considered valid. Text that doesn't reside inside a top-level tag should be considered to be just normal text. This means that information that relates to the table, but isn't contained within it (title, footnotes, etc.), can be included in the same broadcast as the table itself. A new line is always implied after closing a top-level tag if there is more data to display.
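
A rough sketch of how a client might separate normal text from tables in one message. It assumes the message is already unescaped, that tags are well-formed, and that user text contains no literal '<' or '>'. The name split_segments is hypothetical:

```python
def split_segments(message):
    """Split a message into ("text", ...) and ("table", ...) segments."""
    segments = []
    depth = 0    # number of currently open tags
    start = 0    # where the current segment begins
    i = 0
    while i < len(message):
        ch = message[i]
        if ch == "<":
            if depth == 0:
                if i > start:
                    segments.append(("text", message[start:i]))
                start = i
            depth += 1
            # Skip the rest of the opening tag, including its parameters.
            # str.index raises ValueError if the tag is never finished.
            i = message.index(">", i) + 1
            continue
        if ch == ">":
            depth -= 1
            if depth == 0:   # the top-level tag has just been closed
                segments.append(("table", message[start:i + 1]))
                start = i + 1
        i += 1
    if start < len(message):
        segments.append(("text", message[start:]))
    return segments


if __name__ == "__main__":
    msg = "Title Text\n<TVM><R><DR2 BT AC>Header><D>Cell1*>><R><DAR>Cell2>>>*Footnote"
    for kind, part in split_segments(msg):
        print(kind, repr(part))
```

Because only the depth is tracked, a nested table inside a cell stays part of its outer table segment.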



Example:

Code:

FMSG ASID +TABL Title\sText\n\<TVM\>\<R\>\<DR2\sBT\sAC\>Header\>\<D\>Cell1*\>\>\<R\>\<DAR\>Cell2\>\>\>*Footnote

This is a rough approximation of the HTML code:

Code:

Title Text<br>
<table valign="middle">
	<tr>
		<th rowspan="2">Header</th>
		<td>Cell1*</td>
	</tr>
	<tr>
		<td align="right">Cell2</td>
	</tr>
</table><br>
*Footnote

Getting better?

pR0Ps
Junior Member
Posts: 29
Joined: 05 Dec 2010, 11:35

Re: TABL - Displaying data in tables

Post by pR0Ps » 26 Jun 2011, 01:15

Pretorian suggested that, instead of introducing new escape characters, we just make sure the input is sanitized using the standard XML entities.

This means that outside of tags (user text):
  • [<] becomes [&lt;]
  • [>] becomes [&gt;]
  • [&] becomes [&amp;]
  • ['] becomes [&apos;]
  • ["] becomes [&quot;]
With this new rule, for clarity I think that the tag parameters should be separated by an '&' instead of a space. This makes it easier to see which characters are parameters, as well as reducing the byte count by one per separator, since an escaped space ('\s') takes two bytes (not really a big deal, but the first point is still valid).
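
The five replacements for user text can be sketched with Python's standard library. The wrapper names sanitize_cell and restore_cell are made up; escape and unescape from xml.sax.saxutils handle '&', '<' and '>' on their own, and the two quote characters are passed in as extra entities:

```python
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > by default; the quotes have to be added.
EXTRA_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def sanitize_cell(text):
    """Replace the five XML special characters in user text with entities."""
    return escape(text, EXTRA_ENTITIES)


def restore_cell(text):
    """Turn the five entities back into plain characters for display."""
    return unescape(text, {v: k for k, v in EXTRA_ENTITIES.items()})


if __name__ == "__main__":
    original = 'size < 5 & name = "it\'s"'
    sanitized = sanitize_cell(original)
    print(sanitized)
    print(restore_cell(sanitized) == original)
```

Only text outside of tags goes through this; the tag parameters themselves keep their literal '&' separators.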

New example:

Code:

FMSG ASID +TABL Title\sText\n<TVM><R><DR2&BT&AC>Header><D>Cell1*>><R><DAR>Cell2>>>*Footnote

I'm currently working on an HTML <-> TABL format converter.

pR0Ps
Junior Member
Posts: 29
Joined: 05 Dec 2010, 11:35

Re: TABL - Displaying data in tables

Post by pR0Ps » 29 Jun 2011, 16:27

HTML <-> TABL converter is finished: http://pastebin.com/hjGmsb7x

Report any bugs you find

Toast

Re: TABL - Displaying data in tables

Post by Toast » 30 Jun 2011, 07:18

Paste it in here instead of linking, either as a txt file or just in a post.

pR0Ps
Junior Member
Posts: 29
Joined: 05 Dec 2010, 11:35

Re: TABL - Displaying data in tables

Post by pR0Ps » 04 Jul 2011, 02:14

The TABL extension draft documentation has been written (it's changed a lot from the ideas posted in this thread). Until it has been reviewed and accepted, the draft is here: http://storage.webatu.com/TABL.html If you have something to add to it, something you think is wrong or could be done better, post it.

Also, a few improvements to the TABLConverter script:

Code:

###################################################
# TABLConverter v1.2
#
# Author: pR0Ps
# Description: Converts between HTML-formatted tables and TABL-formatted tables
# TABL formatting draft spec: http://www.adcportal.com/forums/viewtopic.php?f=55&t=753
#
# Limitations:
# -No syntax validation
# -Doesn't look at inherited tags
# -Bold tags (<B></B>) don't translate from HTML
#
# v1.0
#  -Initial code
# v1.1
#  -Forgot to rename a variable (*facepalms*)
#  -Added missing warning message
# v1.2
#  -Added support for bolding when converting to HTML
#  -Better support for unrecognized HTML tags
###################################################

import re

def TablEscape (s):
    t = s.replace ('\n', '\\n')
    t = t.replace (' ', '\\s')
    return t

def TablUnEscape(s):
    start = 0
    t = ""
    while start < len(s):
        i = s.find('\\', start)
        if i == -1 or i > len(s)-2:
            t += s[start:]
            break
        t += s[start:i]
        if s[i+1] == 'n' or s[i+1] == 'N':
            t += '<BR>'
        elif s[i+1] == 's' or s[i+1] == 'S':
            t += ' '
        elif s[i+1] == '\\':
            t += '\\'
        else:
            t += s[i] + s[i+1]
        start = i+2
    return t
    

def tablTagParser (tag):
    out = ""
    end = ""

    cen = False
    bold = False
        
    tag = tag.upper()
    params = tag[2:-1].split("&")
    tag = tag[0:2]

    #parse params
    for x in params:
        if x == '':
            continue
        elif len(x) < 2:
            print("WARNING: Invalid param encountered (length < 2): '" + x + "'")
        elif x[0] == "A":
            if x[1] == "L":
                out += " align='left'"
            elif x[1] == "R":
                out += " align='right'"
            elif x[1] == "C":
                cen = True
            else:
                print("WARNING: Unrecognized value for param '" + x[0] + "': " + x[1])
        elif x[0] == "V":
            if x[1] == "T":
                out += " valign='top'"
            elif x[1] == "M":
                out += " valign='middle'"
            elif x[1] == "B":
                out += " valign='bottom'"
            else:
                print("WARNING: Unrecognized value for param '" + x[0] + "': " + x[1])
        elif x[0] == "W":
            if x[1] == "T":
                out += " wrap"
            elif x[1] == "F":
                out += " nowrap"
            else:
                print("WARNING: Unrecognized value for param '" + x[0] + "': " + x[1])
        elif x[0] == "B":
            if x[1] == "T":
                bold = True
            elif x[1] == "F":
                bold = False
            else:
                print("WARNING: Unrecognized value for param '" + x[0] + "': " + x[1])
        elif x[0] == "C":
            out += " colspan='" + x[1:] + "'"
        elif x[0] == "R":
            out += " rowspan='" + x[1:] + "'"
        else:
            print ("WARNING: Unrecognized param: '" + x[0] + "'")

    #parse tag type
    if tag == "<T":
        out = "<TABLE border='1' width='100%'" + out + ">"
        end = "</TABLE>"
    elif tag == "<R":
        out = "<TR" + out + ">"
        end = "</TR>"
    elif tag == "<D":
        if bold and cen:
            out = "<TH" + out + ">"
            end = "</TH>"
        elif cen:
            out = "<TD" + out + " align='center'>"
            end = "</TD>"
        elif bold:
            out = "<TD" + out + "><B>"
            end = "</B></TD>"
        else:
            out = "<TD" + out + ">"
            end = "</TD>"
    else:
        print("WARNING: Invalid tag encountered: '" + tag + "'")
        out = ""

    return out, end 

def htmlTagParser (tag):
    out = ""
    tag = tag.upper()

    #parse tag type
    i = tag.find(' ')
    if i == -1:
        i = tag.find('>')
    if tag[:i] == "<TABLE":
        out = "<T"
    elif tag[:i] == "<TR":
        out = "<R"
    elif tag[:i] == "<TD":
        out = "<D"
    elif tag[:i] ==  "<TH":
        out = "<DAC&BT" #doesn't check for inherited values
    elif tag[:i] == "<BR":
        return "\\n"
    else:
        print("WARNING: Unrecognized tag: " + tag)
        return ""

    #parse tag params
    params = re.findall(r" +([a-zA-Z]+) *= *([\'\"])([a-zA-Z0-9_]+)\2", tag)
    for x in params:
        if x[0] == "ALIGN":
            if x[2] == "LEFT":
                out += "&AL"
            elif x[2] == "RIGHT":
                out += "&AR"
            elif x[2] == "CENTER":
                out += "&AC"
            else:
                print("WARNING: Unrecognized value for param '" + x[0] + "': " + x[2])
        elif x[0] == "VALIGN":
            if x[2] == "TOP":
                out += "&VT"
            elif x[2] == "MIDDLE":
                out += "&VM"
            elif x[2] == "BOTTOM":
                out += "&VB"
            else:
                print("WARNING: Unrecognized value for param '" + x[0] + "': " + x[2])
        elif x[0] == "WRAP":
            out += "&WT"
        elif x[0] == "NOWRAP":
            out += "&WF"
        elif x[0] == "COLSPAN":
            out += "&C" + x[2]
        elif x[0] == "ROWSPAN":
            out += "&R" + x[2]
        else:
            print("WARNING: Unrecognized parameter: '" + x[0] + "'")

        if len(out) > 2 and out[2] == '&': #messy, but works
            out = out[:2] + out[3:]
        
    return out + ">"


def toHtml (tabl):
    tags = []
    parsed = 0 #chars gone through
    out = ""
    while parsed < len(tabl):
        start = tabl.find("<", parsed)
        end = tabl.find(">", parsed)
        if end == -1:
            out += TablUnEscape(tabl[parsed:])
            break
        elif start == -1 or end < start: #tag is ending
            #an unmatched '>' has no open tag to close, so emit nothing for it
            out += TablUnEscape(tabl[parsed:end]) + (tags.pop() if tags else "")
        else: #starting a new tag       
            out += TablUnEscape(tabl[parsed:start])
            end = tabl.find(">", start) #find end of tag start

            #add converted tag to output
            temp1, temp2 = tablTagParser(tabl[start:end+1])
            tags.append(temp2)
            out += temp1
            
        parsed = end+1
    
    return out

def toTabl (html):
    out = ""
    ignoreNext = False
    parsed = 0
    while parsed < len(html):
        start = html.find("<", parsed)
        end = html.find(">", parsed)
        if start == -1 or end == -1:
            out += html[parsed:]
            break
        elif html[start+1] == "/": #end tag
            out += html[parsed:start]
            if not ignoreNext:
                out += ">"
            else:
                ignoreNext = False
        else:
            temp = htmlTagParser(html[start:end+1])
            if temp == "":
                ignoreNext = True
            out += html[parsed:start] + temp

        parsed = end+1

    return TablEscape(out)
        
def main():
    infile = input ("Enter the filename to convert from: ")

    #read data
    try:
        #'with' closes the file; the original 'f.close' was never actually called
        with open(infile, "r") as f:
            data = f.read()
    except Exception as e:
        print (e)
        return
    outfile = input ("Enter the filename to write to: ")
    temp = ""         
    while temp != "1" and temp != "2": 
        temp = input ("Convert options:\n(1) HTML to TABL\n(2) TABL to HTML\n")
    if temp == "1":
        data = toTabl(data)
    else:
        data = toHtml(data)

    #debug
    #print (data + '\n')
    
    #write data
    try:
        f = open(outfile, "w")
        f.write(data)
        f.close()
        print ("Output written to",outfile)
    except Exception as e:
        print (e)
    
#only run the prompts when executed as a script, not when imported
if __name__ == "__main__":
    main()
    print("Exiting...")
