The Length function returns the number of bytes in an MBCS string, not the number of characters, so here we present an AnsiLength function

Determining the actual length of an ANSI string (MBCS)

Copyright © 2000 Ernesto De Spirito


Introduction

The Length function returns the length of a string, but it behaves differently according to the type of the string. For the old short strings (ShortString) and for long strings (AnsiString), Length returns the number of bytes they take, while for wide (Unicode) strings (WideString) it returns the number of wide characters (WideChar), that is, the number of bytes divided by two.
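To make the difference concrete, here is a minimal console sketch (not part of the original article) that prints what Length reports for each string type. The variable names are illustrative. The last case assumes code page 932 (Shift-JIS), where $93 is a lead byte; the bytes are assigned one at a time so the result does not depend on the system code page.

```pascal
program LengthDemo;
{$APPTYPE CONSOLE}

var
  sShort: ShortString;
  sAnsi: AnsiString;
  sWide: WideString;
begin
  sShort := 'abc';
  sAnsi := 'abc';
  sWide := 'abc';

  WriteLn(Length(sShort));                    // 3 (bytes)
  WriteLn(Length(sAnsi));                     // 3 (bytes)
  WriteLn(Length(sWide));                     // 3 (wide characters)
  WriteLn(Length(sWide) * SizeOf(WideChar));  // 6 (bytes behind those 3 WideChars)

  // Two raw bytes that would form ONE character under code page 932
  // (assumption for illustration: $93 is a lead byte, $FA its trail byte).
  SetLength(sAnsi, 2);
  sAnsi[1] := AnsiChar($93);
  sAnsi[2] := AnsiChar($FA);
  WriteLn(Length(sAnsi));                     // 2 (bytes, not characters)
end.
```

The final line is the case this article is about: Length counts two bytes where a reader would see one character.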

In short and long strings, one character takes one byte in Western languages, whereas in Asian languages, for example, some characters take one byte and others take two. For this reason, almost every string function comes in two versions: a fast one that only works with single-byte character strings (SBCS), and a slower one that also works with strings in which a character can take one or two bytes (DBCS), as needed by applications distributed internationally. Thus we have functions like Pos, LowerCase and UpperCase on one side, and AnsiPos, AnsiLowerCase and AnsiUpperCase on the other. Curiously, there is no AnsiLength function that returns the number of characters in a DBCS string.

AnsiLength (Draft)

Here, then, is a function that returns the number of characters in a double-byte character string:

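uses SysUtils;  // declares LeadBytes, the lead-byte set of the system code page

// Note: "string" is taken to be AnsiString here, as in pre-Unicode Delphi.
function AnsiLength(const s: string): integer;
var
  i, n: integer;
begin
  Result := 0;
  n := Length(s);  // length in bytes
  i := 1;
  while i <= n do begin
    inc(Result);   // one more character
    if s[i] in LeadBytes then inc(i);  // double-byte: skip the trail byte too
    inc(i);
  end;
end;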
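Because LeadBytes reflects the system code page, the draft returns different results on different machines. The sketch below is a variation for experimenting, not the article's function: it runs the same loop but takes the lead-byte set as a parameter. CountChars, BytesToAnsi, TByteSet and Cp932LeadBytes are hypothetical names, and the set is hard-coded to the lead-byte ranges of code page 932 (Shift-JIS) as an assumption for the demo.

```pascal
program CountDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TByteSet = set of AnsiChar;

const
  // Assumption: lead-byte ranges of code page 932 (Shift-JIS).
  Cp932LeadBytes: TByteSet =
    [AnsiChar($81)..AnsiChar($9F), AnsiChar($E0)..AnsiChar($FC)];

// Same loop as the draft, with the lead-byte set passed in explicitly.
function CountChars(const s: AnsiString; const Leads: TByteSet): Integer;
var
  i, n: Integer;
begin
  Result := 0;
  n := Length(s);
  i := 1;
  while i <= n do
  begin
    Inc(Result);
    if s[i] in Leads then Inc(i);  // skip the trail byte
    Inc(i);
  end;
end;

// Builds an AnsiString from raw byte values, bypassing any code page conversion.
function BytesToAnsi(const Values: array of Byte): AnsiString;
var
  k: Integer;
begin
  SetLength(Result, High(Values) + 1);
  for k := 0 to High(Values) do
    Result[k + 1] := AnsiChar(Values[k]);
end;

var
  s: AnsiString;
begin
  // Two double-byte characters followed by 'A' and 'B'.
  s := BytesToAnsi([$93, $FA, $96, $7B, $41, $42]);
  WriteLn(Length(s));                      // 6 bytes
  WriteLn(CountChars(s, Cp932LeadBytes));  // 4 characters
  WriteLn(CountChars(s, []));              // 6: with no lead bytes, every byte is a character
end.
```

Passing an empty set reproduces the single-byte case, where the character count and Length agree.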

AnsiLength (Final)

Naturally, this function is not optimized. We are not going to mess with assembler, but at least we can use pointers:

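uses SysUtils;  // declares LeadBytes, the lead-byte set of the system code page

// Note: "string" is taken to be AnsiString here, as in pre-Unicode Delphi.
function AnsiLength(const s: string): integer;
var
  p, q: PChar;
begin
  Result := 0;
  p := PChar(s);       // first byte
  q := p + Length(s);  // one past the last byte
  while p < q do begin
    inc(Result);       // one more character
    if p^ in LeadBytes then
      inc(p, 2)        // double-byte: skip lead and trail byte
    else
      inc(p);          // single-byte character
  end;
end;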
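One edge case worth checking is a string that ends in a lead byte with its trail byte missing, for instance after a byte-based truncation. The sketch below is an illustration under stated assumptions, not the article's code: it repeats the pointer loop with an explicit lead-byte set (the hypothetical CountCharsPtr, TByteSet and Cp932LeadBytes, with the ranges of code page 932 assumed) and feeds it such a string.

```pascal
program TruncatedDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TByteSet = set of AnsiChar;

const
  // Assumption: lead-byte ranges of code page 932 (Shift-JIS).
  Cp932LeadBytes: TByteSet =
    [AnsiChar($81)..AnsiChar($9F), AnsiChar($E0)..AnsiChar($FC)];

// Pointer-based count, same loop as the final version above,
// with the lead-byte set passed in explicitly.
function CountCharsPtr(const s: AnsiString; const Leads: TByteSet): Integer;
var
  p, q: PAnsiChar;
begin
  Result := 0;
  p := PAnsiChar(s);
  q := p + Length(s);
  while p < q do
  begin
    Inc(Result);
    if p^ in Leads then
      Inc(p, 2)
    else
      Inc(p);
  end;
end;

var
  s: AnsiString;
begin
  SetLength(s, 2);
  s[1] := 'A';
  s[2] := AnsiChar($93);  // lead byte whose trail byte was cut off
  WriteLn(CountCharsPtr(s, Cp932LeadBytes));  // 2
end.
```

The dangling lead byte is counted as one character. The pointer then steps one position past q, but the loop condition stops before that position is ever read.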
Copyright © 2000/2006 Ernesto De Spirito.   All rights reserved.