Geocoding Problems Using Soundex With Brazilian Addresses (Doc ID 1492124.1)

Last updated on FEBRUARY 08, 2017

Applies to:

Oracle Spatial and Graph - Version 11.2.0.3 and later
Information in this document applies to any platform.

Symptoms

Soundex matching in the geocoder does not appear to work as expected. The following block geocodes the street name 'R Elena':

SET SERVEROUTPUT ON

DECLARE
  v  SDO_GEO_ADDR;  -- input address
  v2 SDO_GEO_ADDR;  -- geocoded result
BEGIN
  -- Build the input address, using a relaxed match mode for the street base name
  v := SDO_GEO_ADDR();
  v.StreetName   := 'R Elena';
  v.MatchMode    := 'RELAX_BASE_NAME';
  v.Municipality := 'São Paulo';
  v.Region       := 'SÃO PAULO';
  v.Country      := 'BR';

  -- Geocode against the data owned by the GC_ORACLE schema
  v2 := SDO_GCDR.GEOCODE_ADDR('GC_ORACLE', v);

  -- Print every attribute of the result
  DBMS_OUTPUT.PUT_LINE('---------------------------------');
  DBMS_OUTPUT.PUT_LINE('Place name (POI): '    || v2.PlaceName);
  DBMS_OUTPUT.PUT_LINE('Street name: '         || v2.StreetName);
  DBMS_OUTPUT.PUT_LINE('Intersecting street: ' || v2.IntersectStreet);
  DBMS_OUTPUT.PUT_LINE('Postal code: '         || v2.PostalCode);
  DBMS_OUTPUT.PUT_LINE('Postal add-on code: '  || v2.PostalAddOnCode);
  DBMS_OUTPUT.PUT_LINE('Full postal code: '    || v2.FullPostalCode);
  DBMS_OUTPUT.PUT_LINE('House number: '        || v2.HouseNumber);
  DBMS_OUTPUT.PUT_LINE('Street base name: '    || v2.BaseName);
  DBMS_OUTPUT.PUT_LINE('Street type: '         || v2.StreetType);
  DBMS_OUTPUT.PUT_LINE('Street prefix: '       || v2.StreetPrefix);
  DBMS_OUTPUT.PUT_LINE('Secondary unit: '      || v2.SecUnit);
  DBMS_OUTPUT.PUT_LINE('Settlement: '          || v2.Settlement);
  DBMS_OUTPUT.PUT_LINE('Municipality: '        || v2.Municipality);
  DBMS_OUTPUT.PUT_LINE('Region: '              || v2.Region);
  DBMS_OUTPUT.PUT_LINE('Edge ID: '             || v2.EdgeId);
  DBMS_OUTPUT.PUT_LINE('Longitude: '           || v2.Longitude);
  DBMS_OUTPUT.PUT_LINE('Latitude: '            || v2.Latitude);
  DBMS_OUTPUT.PUT_LINE('MatchCode: '           || v2.MatchCode);
  DBMS_OUTPUT.PUT_LINE('MatchMode: '           || v2.MatchMode);
  DBMS_OUTPUT.PUT_LINE('MatchVector: '         || v2.MatchVector);
END;
/

The geocoder returns 'Av Eulina', but the correct street name is 'R. Helena'. Why do mistakes like this occur? That is our main question. Is Soundex the best way to geocode Brazilian Portuguese addresses? Is there a way to adjust it?
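
One way to see why 'Elena' could resolve to 'Eulina' rather than 'Helena' is to compare classic Soundex codes, which keep the first letter and ignore the remaining vowels. The query below is only a sketch: it uses the built-in SQL SOUNDEX function as a stand-in, on the assumption (not confirmed by this document) that the geocoder's phonetic matching behaves like classic Soundex.

```sql
-- Compare classic Soundex codes for the street base names involved.
-- SOUNDEX is the built-in SQL function, used here as a stand-in for the
-- geocoder's internal matching (an assumption, not documented behavior).
SELECT SOUNDEX('Elena')  AS requested_name,  -- E450
       SOUNDEX('Eulina') AS returned_name,   -- E450
       SOUNDEX('Helena') AS expected_name    -- H450
FROM   dual;
```

Under classic Soundex, 'Elena' and 'Eulina' share a code, while 'Helena' gets a different one because the leading H is kept even though it is silent in Portuguese. If the geocoder works this way, the match on 'Eulina' is the expected outcome of the algorithm rather than a malfunction.
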
A separate problem with REGION matching is related to the GC_PARSER_PROFILES table. It has a SECTION_LABEL called 'REGION_LIST' with an entry like this:

OUTPUT_KEYWORD | KEYWORDS
SP | MDSYS.SDO_KEYWORDARRAY('SP','SÃO PAULO')

That is why 'SÃO PAULO' works and 'SAO PAULO' does not. However, it is impractical for us to declare every possible spelling of a region, and we would like some kind of phonetic matching to be applied here.
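
To check which spellings are currently declared for each region, the keyword arrays can be unnested. This is a minimal sketch that uses only the columns mentioned above (SECTION_LABEL, OUTPUT_KEYWORD, KEYWORDS). It assumes that the profile table is visible from the current schema and that no further filter, such as a country restriction, is needed.

```sql
-- List every declared spelling per region keyword.
-- Assumes GC_PARSER_PROFILES is accessible from the current schema.
SELECT p.output_keyword,
       k.column_value AS declared_spelling
FROM   gc_parser_profiles p,
       TABLE(p.keywords) k          -- unnest the SDO_KEYWORDARRAY varray
WHERE  p.section_label = 'REGION_LIST'
ORDER  BY p.output_keyword, k.column_value;
```

A spelling that does not appear in this list, such as 'SAO PAULO' for the SP entry, would not be recognized as that region.
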

The more serious problem occurs with the areas declared in gc_area_bra. For example, take the municipality 'São Bernardo do Campo'. When the request uses 'São Bernardo do Canpo' (N instead of M), the address is not found.
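
An M/N substitution is the kind of error a phonetic match would normally absorb. As an illustration, again using the built-in SQL SOUNDEX function as a stand-in (an assumption about how phonetic matching would behave, not a description of the geocoder), the two spellings of the last word produce the same code:

```sql
-- Classic Soundex maps M and N to the same digit, so both spellings agree.
SELECT SOUNDEX('Campo') AS correct_spelling,  -- C510
       SOUNDEX('Canpo') AS misspelling        -- C510
FROM   dual;
```

Because the codes are equal, the failed lookup is consistent with area names being compared without any Soundex step.
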

In summary:
  - Soundex makes some odd mistakes with Brazilian Portuguese;
  - Region and other areas do not use Soundex at all.

Cause
