Oracle VARNUM/NUMBER encoding in C

Just a short example C implementation of Oracle VARNUM/NUMBER type encoding, in case you cannot use the library functions. I didn't find one anywhere when I needed it (and afterwards I found it doesn't help me at all, due to a little 'problem' in Oracle :() and the Oracle documentation is pretty unclear about how to do it.

#include <stdint.h>
#include <string.h>

typedef	struct
{
	uint8_t len;		/* number of bytes that follow: exp + mantissa */
	uint8_t exp;		/* sign bit and base-100 exponent */
	uint8_t man[20];	/* base-100 mantissa digits */
} VARNUM_t;

static void setVARNUM(VARNUM_t *varnum, int64_t value)
{
	uint8_t b100[10];	/* base-100 digits, least significant first */
	int8_t digits = 0;
	int8_t first = 0;
	uint64_t u_value = 0;
	/* init */
	memset(varnum, 0, sizeof(*varnum));
	varnum->len	= 1;	/* the exponent byte is always present */
	/* zero is a special case: exponent byte 0x80 and no mantissa */
	if(value == 0)
	{
		varnum->exp	= 128;
		return;
	}
	/* absolute value; unsigned negation is also safe for INT64_MIN */
	u_value = (value < 0 ? 0 - (uint64_t)value : (uint64_t)value);
	/* split into base-100 digits with integer math (log() can round the wrong way) */
	for(; u_value > 0; u_value /= 100)
		b100[digits++] = u_value % 100;
	/* exponent: 128 (sign bit) + 65 (bias) + position of the leading digit */
	varnum->exp	= (digits - 1) + 128 + 65;
	if(value < 0)
		varnum->exp	= ~varnum->exp;
	/* trailing zero digits are not stored, inner zero digits are */
	while(b100[first] == 0)
		first++;
	/* mantissa, most significant digit first */
	for(digits--; digits >= first; digits--)
	{
		/* digit + 1 for positive, 101 - digit for negative */
		varnum->man[varnum->len - 1] = (value < 0 ? (101 - b100[digits]) : (b100[digits] + 1));
		varnum->len++;
	}
	/* terminator byte for negative value */
	if(value < 0 && varnum->len <= 20)
	{
		varnum->man[varnum->len - 1] = 102;
		varnum->len++;
	}
	return;
}
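
To sanity-check the encoding, it helps to decode the bytes again. The following is only a minimal sketch, not an Oracle API: the helper name varnum_to_int64 is made up for this example, it assumes the buffer is laid out as [len][exp][mantissa...] as in the struct above, and it handles integer values only (fractions and values outside the int64_t range are rejected).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode an integer-valued VARNUM buffer back into
 * an int64_t. Returns 0 on success, -1 if the buffer is malformed,
 * holds a fraction, or does not fit into int64_t.
 */
static int varnum_to_int64(const uint8_t *buf, int64_t *out)
{
	assert(buf != NULL && out != NULL);
	size_t n = buf[0];			/* bytes after the length byte */
	if(n < 1 || n > 21)
		return -1;
	n--;					/* mantissa bytes only */
	uint8_t exp = buf[1];
	int negative = !(exp & 0x80);		/* sign bit set: positive or zero */
	if(exp == 0x80 && n == 0)
	{
		*out = 0;			/* zero: exponent byte only */
		return 0;
	}
	if(negative)
	{
		exp = (uint8_t)~exp;		/* undo the complement */
		if(n > 0 && buf[1 + n] == 102)	/* drop the terminator byte */
			n--;
	}
	int e = (int)exp - 193;			/* base-100 exponent of the first digit */
	if(n == 0 || e < 0 || e > 9 || n > (size_t)e + 1)
		return -1;			/* malformed, fraction, or too large */
	uint64_t limit = (uint64_t)INT64_MAX + (negative ? 1 : 0);
	uint64_t acc = 0;
	for(int i = 0; i <= e; i++)
	{
		int d = 0;			/* omitted trailing digits are zero */
		if((size_t)i < n)
			d = negative ? 101 - buf[2 + i] : buf[2 + i] - 1;
		if(d < 0 || d > 99 || acc > (limit - (uint64_t)d) / 100)
			return -1;		/* bad digit or overflow */
		acc = acc * 100 + (uint64_t)d;
	}
	if(acc == 0)
		return -1;			/* zero has its own encoding */
	*out = negative ? -(int64_t)(acc - 1) - 1 : (int64_t)acc;
	return 0;
}
```

For example, the bytes 4, 195, 2, 24, 46 decode to 12345 and the bytes 4, 61, 100, 78, 102 decode to -123.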

NUMBER is the same as VARNUM but without the len member, so it is one byte shorter.
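
A rough sketch of that relation: the NUMBER image is simply the exp byte followed by the mantissa bytes, and the length has to travel separately. The helper name varnum_to_number is hypothetical, and the sketch assumes the struct has no padding (true for a struct made only of uint8_t members on common compilers).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the VARNUM_t used for encoding. */
typedef struct
{
	uint8_t len;
	uint8_t exp;
	uint8_t man[20];
} VARNUM_t;

/*
 * Hypothetical helper: copy the NUMBER bytes (exp + mantissa) out of a
 * VARNUM. Returns the number of bytes written, or 0 if the VARNUM length
 * is invalid or the output buffer is too small.
 */
static size_t varnum_to_number(const VARNUM_t *varnum, uint8_t *out, size_t out_size)
{
	assert(varnum != NULL && out != NULL);
	size_t n = varnum->len;			/* counts exp + mantissa bytes */
	if(n < 1 || n > 21 || n > out_size)
		return 0;
	/* skip the len byte and copy everything after it */
	memcpy(out, (const uint8_t *)varnum + offsetof(VARNUM_t, exp), n);
	return n;
}
```

The return value plays the role of the dropped len byte: the caller has to hand it over together with the buffer.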

Note: this is just a quick implementation that works; some easy optimizations are possible…

Perl UTF-8 and using Digest functions

I've implemented a neat new feature (storing unique content only once in the cache) in my Perl-based ETL tool, and suddenly it started to sometimes print the Perl warning 'Wide character in subroutine entry' from the sha1_hex call. As if this was not enough, the processed content, after being stored in the cache, started to be UTF-8 corrupted in comparison to the copy stored in the cache.

It took a few funny hours of playing with Perl until I found that the sha1_hex function somehow destroyed the content of its parameter, but in such a beautiful way that it was almost impossible to detect. The best part is that the cached content was the output of the LibXML toString() function, yet the XML tree itself (or whatever) was corrupted as well. Well, it must be one of those 'Perl secrets'.

Afterwards Google gave me some explanation of this behaviour: I found a similar problem reported for the md5_sum function from the Digest::MD5 package.

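So finally, to fix the problem, one must call encode_utf8() on the sha1_hex parameter, so that sha1_hex works on the encoded copy instead of the original string. The digest functions operate on bytes, not characters, which is why a string with wide characters has to be encoded first.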