To CID or not to CID

Pietry
Senior Member
Posts: 328
Joined: 04 Dec 2007, 07:25
Location: Bucharest

Post by Pietry » 29 Jul 2009, 07:26

Lately on the DC development hub, people have been arguing heatedly about a long-standing issue that has bothered every hubsoft developer: why, how, and when does DC++ (and all its derivatives) display the annoying message "x has the same cid as y, ignoring"? I think every hubsoft developer has seen this message while developing their software (and people can confirm).

The first question I asked myself was why the message appears. DC++ keeps a list of users. When a user is added (either a user logs in and DC++ receives a new BINF from the hub, or DC++ itself logs in and receives a bunch of BINFs for the users already connected), DC++ must remember the user, add it to the right-side panel, and display it. However, it is only logical that two users cannot have the same client ID (CID). The CID is unique, and its sole purpose is to identify users. So when the hub sends a BINF for another user (different SID) with the same CID as an existing user, without the first one having disconnected, it is clear to me that something is wrong. DC++'s solution is to ignore the second user (the one that causes the collision) and display the error message.
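
To make the behaviour described above concrete, here is a minimal C# sketch of a client-side user list that ignores a BINF whose CID is already held by a different SID. It only illustrates the logic as described in this post and is not DC++'s actual code; the names `usersByCid`, `OnInf` and `OnQuit` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical client-side user list: CID -> (SID, nick).
var usersByCid = new Dictionary<string, (string Sid, string Nick)>();

// Handles a BINF for one user. Returns the warning text when the CID is
// already held by a different SID, or an empty string when the user is accepted.
string OnInf(string sid, string cid, string nick)
{
    if (usersByCid.TryGetValue(cid, out var existing) && existing.Sid != sid)
    {
        // Same CID, different SID, and no IQUI seen for the first user:
        // the newcomer is ignored and the first user is kept.
        return $"{nick} has the same cid as {existing.Nick}, ignoring";
    }
    usersByCid[cid] = (sid, nick);   // new user, or an INF update for a known one
    return "";
}

// Handles an IQUI: forget the user that owned this SID, if any.
void OnQuit(string sid)
{
    var match = usersByCid.FirstOrDefault(e => e.Value.Sid == sid);
    if (match.Key != null)
    {
        usersByCid.Remove(match.Key);
    }
}

Console.WriteLine(OnInf("AAAB", "CID1", "alice"));  // accepted, prints an empty line
Console.WriteLine(OnInf("AAAC", "CID1", "bob"));    // prints "bob has the same cid as alice, ignoring"
```

In this model the warning can only appear if the hub announces a CID again before announcing the IQUI of the previous owner, which is exactly the kind of ordering problem a hubsoft can have.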

Now that this question has an answer, let's see where the problem lies. While developing my own hubsoft, I suspected that my software was at fault: not sending the correct messages, perhaps losing some messages or some users along the way, or perhaps having synchronization problems (for example, sending a new BINF before sending the IQUI for the old user).

After working a lot on the problem and using all my karma and skills, I managed to reduce it to a minimum, but not to eliminate it completely. Finally, I gave up out of disappointment.

Jan Vidar Krey has proposed that the client has the problem, because all current hubsofts show the same bug, and there is only a small chance that all of us made the same mistake. He wrote a script that tests the case. The CID generation may be a problem (39 base32 characters carry 195 bits, 3 more than the 192-bit CID). I'm not yet sure what to think about the CID problem, but according to my logging with ADC Debug on DC++, the client was right whenever it displayed the "same cid" message. I analysed the log, and DSHub was indeed sending bad messages (new BINFs for CIDs already connected). During my testing, I did not see a single unjustified "same cid" message from DC++.

Another argument I came across was that other clients do not show this message. However, I do not consider this argument valid. Perhaps the other clients do not print it, ignore the problem, or simply aren't coded in a way that checks for this kind of collision.

The question, "To CID or not to CID", does not have an answer for now. The facts are that there might be a synchronization bug in all the hubsofts, since it is a very common problem and a very hard one to spot, and that the message certainly does occur.


For the normal user, this message is just annoying and does not interfere with normal hub operation (its effect is too small to be considered harmful). In the worst case, the normal user might notice that some users are not shown and ghosts are shown instead (only in a very small percentage of cases though, 0.5-1%). The next version of DC++ will move the "same cid" message from main chat to the system log, which will improve the user experience on ADC hubs for now.

As an update, arne has confirmed that the end padding was indeed a problem: the last 3 bits were dropped in base32 decoding, so CIDs that differed only in the last 3 bits actually collided. We will see in the future whether the CID problem runs deeper than initially thought, but for now the conclusion is that the client is more or less OK, and we go back to our initial belief that the hubsofts have a problem.
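
The padding effect can be reproduced with a small sketch. This is a generic base32 decoder written for illustration, not the decoder DC++ uses; `Base32Decode` is a hypothetical helper and the all-'A' strings are made-up values rather than real CIDs. 39 characters carry 39 x 5 = 195 bits, a CID is 24 x 8 = 192 bits, so the low 3 bits of the last character never reach the decoded bytes.

```csharp
using System;
using System.Linq;

const string Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";   // RFC 4648 base32 alphabet

// Decodes base32 into whole bytes. Bits left over at the end that do not
// fill a byte are dropped. Assumes valid upper-case input without '=' padding.
byte[] Base32Decode(string text)
{
    var output = new byte[text.Length * 5 / 8];
    int buffer = 0, bitsInBuffer = 0, index = 0;
    foreach (char c in text)
    {
        buffer = (buffer << 5) | Alphabet.IndexOf(c);
        bitsInBuffer += 5;
        if (bitsInBuffer >= 8)
        {
            bitsInBuffer -= 8;
            output[index++] = (byte)((buffer >> bitsInBuffer) & 0xFF);
            buffer &= (1 << bitsInBuffer) - 1;   // keep only the bits not yet emitted
        }
    }
    return output;
}

string prefix = new string('A', 38);
byte[] a = Base32Decode(prefix + "A");   // last character = 00000
byte[] b = Base32Decode(prefix + "H");   // last character = 00111, differs only in the 3 extra bits
byte[] c = Base32Decode(prefix + "I");   // last character = 01000, differs in a real data bit

Console.WriteLine(a.Length * 8);          // 192
Console.WriteLine(a.SequenceEqual(b));    // True: two different strings, one binary CID
Console.WriteLine(a.SequenceEqual(c));    // False
```

So a hub that compares CIDs as strings and a client that compares the decoded bytes can disagree about whether two users collide.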

I do hope the cause of the problem comes to light, because it has been bothering me personally for a long time, and all developers in general in the same way.

darkKlor
Senior Member
Posts: 100
Joined: 30 Dec 2008, 14:59

Re: To CID or not to CID

Post by darkKlor » 31 Jul 2009, 04:25

One issue I found when I was initially developing Netfraction was that almost ALL my users had the same CID. It turned out that this was because we were downloading a DC++ release (BCDC++ at the time), configuring it to connect to the hub on our LAN, including creating a default user account name, and then distributing this copy to new users. So, since the first run of the installation technically happened on only one computer, the CID would be the same across all users.

I think the most likely cause is either some similar situation, or the client not producing a suitably random series of bytes to generate the PID. For PID generation (in C#) I use:

Code:

// 24 random bytes = 192-bit PID
byte[] pid = new byte[24];
Random rand = new Random();
rand.NextBytes(pid);

It seems fairly decent at producing random numbers, as long as two generators aren't created at the same instant of space-time (the default constructor for the Random class uses a time-dependent seed value).
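
As an aside, one way to take the time-dependent seed out of the picture is to fill the PID from the operating system's cryptographic generator instead. This is a possible variation and not what Netfraction does; `GeneratePid` is a hypothetical helper name.

```csharp
using System;
using System.Security.Cryptography;

// Fills a 24-byte (192-bit) PID from the OS cryptographic RNG,
// which does not rely on a time-based seed.
byte[] GeneratePid()
{
    var pid = new byte[24];
    using (var rng = RandomNumberGenerator.Create())
    {
        rng.GetBytes(pid);
    }
    return pid;
}

byte[] first = GeneratePid();
byte[] second = GeneratePid();
Console.WriteLine(BitConverter.ToString(first));
Console.WriteLine(BitConverter.ToString(second));   // differs from the first line
```

Better randomness does nothing for the copied-installation case above, of course: there the PID is generated once and then shipped to everybody.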

On the hub-side, it's critical that the ID parameter of the INF message is only handled during the Identify stage e.g. from the NetfractionHub INF handler:

Code:

switch (param.ParameterCode)
{
    case "ID":
        // Only handle Client ID changes when the user connects. It cannot be changed later.
        if (node.Connection.State == NetfractionHub.RemoteMachine.ConnectionState.Identify)
        {
            node.Identity.ClientId = param.ParameterValue;
        }
        // ... (rest of the handler not shown)

and then later in the INF handler...

Code:

switch (node.Connection.State)
{
    case NetfractionHub.RemoteMachine.ConnectionState.Identify:
        lock (node)
        {
            if (!string.IsNullOrEmpty(node.Identity.ClientId))
            {
                if (node.Identity.ClientId != ConnectionManager.GenerateClientId(node, Base32.Decode(node.Identity.PrivateId)))
                {
                    ConnectionManager.SendStatusMessageToNode(node, Factory.FactorySTA.StatusSeverity.FatalError, Factory.FactorySTA.StatusErrorCode.InvalidPrivateID);
                    ConnectionManager.DisconnectNode(node);
                    return false;
                }

                if (ConnectionManager.GetConnectedNodeByClientId(node.Identity.ClientId) != null)
                {
                    ConnectionManager.SendStatusMessageToNode(node, Factory.FactorySTA.StatusSeverity.FatalError, Factory.FactorySTA.StatusErrorCode.ClientIDTaken);
                    node.Connection.Disconnect();
                    return false;
                }
                // ... (rest of the handler not shown)

The key checks run there are:
1) Is the connection in the Identify state? If so, set the client ID.
2) Is the connection in the Identify state? If so, lock the current node (prevents cross-threaded access), then:
   a) Is the client ID empty? If so, disconnect with STA 'Required INF Field Missing Or Bad' FMID (not shown).
   b) Does the client ID supplied match the hash of the supplied private ID? If not, disconnect with STA Invalid PID.
   c) Does a connected node already have the client ID? If so, disconnect with STA CID Taken.

I haven't noticed any issues with these checks. Check 2(b) is the only one which cares about base32 or the session hash algorithm, and it's not even related to the 'same cid' issue. All the other checks are on connection state or between string values, with the addition of an object lock to prevent synchronisation issues. Does anybody see any weakness in them? I really do think a client-side CID generation issue is more likely, simply because creating a random PID and performing a hash on it is a bit more complex than the simple checks above. Having said that, any failure to consider threading issues in the hub could certainly ruin your day in a litany of unknown ways.
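
For anyone who wants to experiment with checks 2(a) to 2(c) outside a full hub, here is a minimal self-contained sketch. It is not the Netfraction code: `connectedByCid`, `registryLock`, `DeriveCid`, `TryIdentify` and `Release` are hypothetical names, and SHA-256 stands in for the session hash (Tiger) only because it ships with .NET. One assumption is made explicit: the "CID taken" lookup and the registration run under a single hub-wide lock, since a lock held on one node object would not by itself stop two different nodes from passing the lookup at the same moment.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Hypothetical hub-wide registry of identified nodes: CID -> SID.
var connectedByCid = new Dictionary<string, string>();
var registryLock = new object();

// Stand-in for the CID derivation. ADC hashes the PID with the session hash;
// SHA-256 is an assumption made for this sketch only.
string DeriveCid(byte[] pid)
{
    using (var sha = SHA256.Create())
    {
        return Convert.ToBase64String(sha.ComputeHash(pid));
    }
}

// Runs the Identify-stage checks. Returns "OK" or the name of the failed check.
string TryIdentify(string sid, string claimedCid, byte[] pid)
{
    if (string.IsNullOrEmpty(claimedCid))
        return "RequiredFieldMissing";          // check 2(a)
    if (claimedCid != DeriveCid(pid))
        return "InvalidPrivateID";              // check 2(b)

    // Lookup and registration happen under one lock, so two nodes that
    // identify concurrently cannot both pass the "CID taken" check.
    lock (registryLock)
    {
        if (connectedByCid.ContainsKey(claimedCid))
            return "ClientIDTaken";             // check 2(c)
        connectedByCid[claimedCid] = sid;
    }
    return "OK";
}

// Has to run when a node disconnects, otherwise its CID stays taken.
void Release(string cid)
{
    lock (registryLock)
    {
        connectedByCid.Remove(cid);
    }
}

byte[] pid = new byte[24];                      // all-zero PID, for demonstration only
string cid = DeriveCid(pid);
Console.WriteLine(TryIdentify("AAAB", cid, pid));   // OK
Console.WriteLine(TryIdentify("AAAC", cid, pid));   // ClientIDTaken
```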

blastbeat
Member
Posts: 53
Joined: 10 Jan 2008, 19:56

Post by blastbeat » 31 Jul 2009, 10:23

What about stripping the ID part from the INF before sending it to all clients? Is there any requirement for the clients to know each other's CIDs? If not, stripping it saves network traffic and avoids the error.
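
A rough sketch of what such stripping could look like on the hub side, assuming the usual layout of a broadcast INF ("BINF", the SID, then space-separated parameters that each start with a two-letter code). `StripParameter` is a hypothetical helper and the CID in the sample line is a placeholder, not a real one.

```csharp
using System;
using System.Linq;

// Removes every parameter with the given two-letter code from a message of the
// form "BINF <sid> <param> <param> ...". ADC escapes spaces inside values as
// "\s", so splitting on the space character does not cut a value in half.
string StripParameter(string message, string code)
{
    string[] tokens = message.Split(' ');
    var kept = tokens.Take(2)                   // "BINF" and the SID stay untouched
        .Concat(tokens.Skip(2).Where(t => !t.StartsWith(code, StringComparison.Ordinal)));
    return string.Join(" ", kept);
}

string original = "BINF AAAB IDEXAMPLECID NIalice SL3";
Console.WriteLine(StripParameter(original, "ID"));   // BINF AAAB NIalice SL3
```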

Pretorian
Site Admin
Posts: 214
Joined: 21 Jul 2009, 10:21

Post by Pretorian » 31 Jul 2009, 11:11

As far as "requirements" go, there's no requirement that ID (or any other INF parameter) be present. However, you're back to square one as far as user identification goes.

adrian_007
Senior Member
Posts: 126
Joined: 06 Jan 2008, 13:00

Post by adrian_007 » 31 Jul 2009, 11:12

We need the CID for identification across multiple hubs... that's the main difference between NMDC and ADC (in NMDC we use the nick, which is kinda broken).

Dj_Offset
Member
Posts: 53
Joined: 15 Sep 2008, 21:48
Location: adcs://adcs.uhub.org:1511

Post by Dj_Offset » 01 Aug 2009, 18:33

Why is the CID needed these days? We already have the SID?
The original idea was to work around issues with download queues, and people on multiple hubs might have multiple nick names, which caused problems.
However, back in those days we did not have hashed files, so basically if you wanted to download a file it was downloaded from user/path/to/file instead of "TTH:xxxx...".

The CID might still prove useful for distributed networks and TLS - but only if it gets fixed from a security point of view.

Post by darkKlor » 02 Aug 2009, 01:54

I use the CID to uniquely identify clients between sessions...

It lets the user change their nickname whenever they like, and connect from different IP addresses, and my hub still knows who they are i.e. registered or not.

The only time the CID should change is when a client is first installed i.e. an upgrade will not change it, but going from say, DC++ to BCDC++ will mean a new CID, and uninstalling a client in a manner that removes all settings will lead to a new CID when it is reinstalled.

Having said that... if the hub is running ADCS and is issuing certificates to all registered users, then I see no need for a CID to be sent to the hub in that case! The client would still need one though to connect to other hubs as a non-registered user.

Post by darkKlor » 02 Aug 2009, 01:59

Addendum: A TLS certificate is much more transportable than a CID. Just drop the .crt file into your certificates path when you change hub software or do something like a clean install of your OS.

Post by blastbeat » 02 Aug 2009, 09:40

darkKlor wrote:I use the CID to uniquely identify clients between sessions...

It lets the user change their nickname whenever they like, and connect from different IP addresses, and my hub still knows who they are i.e. registered or not.

For this you don't need a CID. And in my experience, most hub owners/users want to register via nick because no one understands this "CID thing"; in some clients it's even impossible to copy-paste the CID.

Post by darkKlor » 02 Aug 2009, 10:39

blastbeat wrote:in some clients it's even impossible to copy-paste the CID

Well I wrote a hub command to get around that :P
!getcid <nick>
