An Unknown DBA blog: Troubleshoot Oracle datapump import to AWS RDS

Any Cloud is sold to be "click and play", "one click to the Cloud", name it . . . like if a magic AI would transfer all your application and data to a magic place named "the Cloud" in a minute by just pressing a big button:

Even if we must admit that it is relatively easy to move to any Cloud vendor (for obvious reasons), sometimes things go wrong and you need to go beyond that big button and see what's behind the scene to troubleshoot it.

This blog will explore an Oracle datapump to an AWS RDS instance scenario which is a pretty classic and easy process once you are familiar with the AWS RDS specifities which are well documented. It is also a good opportunity to learn how to use Oracle dbms_datapumpwhich I personally never used before !

So once you have created your target user(s), tablespace(s) on your RDS instance the same way you would do with any Oracle database:

SQL> create bigfile tablespace THE_TABLESPACE datafile size 100M autoextend on maxsize unlimited ;
SQL> CREATE USER THE_USER IDENTIFIED  BY XXXXX DEFAULT TABLESPACE THE_TABLESPACE TEMPORARY TABLESPACE TEMP QUOTA UNLIMITED ON THE_TABLESPACE PROFILE DEFAULT;
SQL> GRANT CREATE SESSION TO THE_USER;
SQL> GRANT CREATE JOB TO THE_USER;
SQL> GRANT CREATE PROCEDURE TO THE_USER;
SQL> GRANT CREATE SEQUENCE TO THE_USER;
SQL> GRANT CREATE TABLE TO THE_USER;
SQL> GRANT RESOURCE to THE_USER;
SQL>

you have transfered your dumpfile from S3 to your RDS instance:

SQL> SELECT rdsadmin.rdsadmin_s3_tasks.download_from_s3(
  p_bucket_name  =>  'my_bucket',   
  p_directory_name =>  'IMP')
  AS TASK_ID FROM DUAL;
TASK_ID
------------------
1570068680626-20
SQL> SELECT text FROM table(rdsadmin.rds_file_util.read_text_file('BDUMP','dbtask-1570068680626-20.log')) ;
TEXT
-----------------------------------------------------------------------------------------
2019-10-03 02:19:07.206 UTC [INFO ] This task is about to download the Amazon S3 object or
objects in /rdsdbdata/userdirs/01 from bucket name my_bucket and key a_dumpfile.
SQL>

Note that your instance needs the S3_INTEGRATION role to achieve this and avoid the below error:

SQL> SELECT text FROM table(rdsadmin.rds_file_util.read_text_file('BDUMP','dbtask-1570068680626-20.log')) ;
TEXT
--------------------------------------------------------------
2019-10-03 02:11:20.660 UTC [ERROR] The DB instance doesn't have credentials to access the specified Amazon S3 bucket. To grant access, add the S3_INTEGRATION role to the DB instance.
2019-10-03 02:11:20.663 UTC [INFO ] The task failed.
SQL>

your file is well in your "IMP" directory:

SQL> select * from table(RDSADMIN.RDS_FILE_UTIL.LISTDIR('IMP')) order by mtime;
FILENAME                 TYPE  FILESIZE     MTIME
------------------  ---------- ---------- ---------
a_dumpfile.dmp           file  12.0608E+10 03-OCT-19

by the way, it seems that in some version, the admin user has no WRITE privilege in the default DATA_PUMP_DIR then we sometimes have to create one on our own -- in doubt, I wonder if we shouldn't always create a new directory:

SQL> SELECT rdsadmin.rdsadmin_s3_tasks.download_from_s3(
  p_bucket_name  =>  'my_bucket',   
  p_directory_name =>  'DATA_PUMP_DIR')
  AS TASK_ID FROM DUAL; 
SELECT rdsadmin.rdsadmin_s3_tasks.download_from_s3(
  *
ERROR at line 1:
ORA-20003: Current user does not have WRITE privilege for directory DATA_PUMP_DIR
ORA-06512: at "RDSADMIN.RDSADMIN_S3_TASKS", line 97
ORA-06512: at "RDSADMIN.RDSADMIN_S3_TASKS", line 109
SQL> exec rdsadmin.rdsadmin_util.create_directory(p_directory_name => 'IMP') ;
PL/SQL procedure successfully completed.
SQL> select * from table(RDSADMIN.RDS_FILE_UTIL.LISTDIR('IMP')) ;
FILENAME    TYPE     FILESIZE    MTIME
-------- ---------- ---------- ---------
01/       directory    4096    03-OCT-19
SQL>

then now that we have our button's pedestal, let's finish it with some lines of . . . dbms_datapump (this is OK, this is provided in the official doc) and push the button (well . . . ENTER)!:

SQL> declare
h1 number ;
BEGIN
h1 := DBMS_DATAPUMP.OPEN(operation => 'IMPORT', job_mode => 'SCHEMA', job_name=>null);
DBMS_DATAPUMP.ADD_FILE(handle => h1, filename =>'DUMPFILE_NAME', directory =>'DIRECTORY_NAME', filetype => dbms_datapump.ku$_file_type_dump_file);
DBMS_DATAPUMP.ADD_FILE(handle => h1, filename => 'LOGFILE_NAME', directory => 'DIRECTORY_NAME', filetype => dbms_datapump.ku$_file_type_log_file);
DBMS_DATAPUMP.START_JOB(handle=>h1);
END;
/
declare
*
ERROR at line 1:
ORA-39002: invalid operation
ORA-06512: at "SYS.DBMS_DATAPUMP", line 5594
ORA-06512: at line 7

and here is the accident, you get this invalid operation error stack not very verbose on the root cause of the issue which demystifies that magic button a bit. Then here, alertlog won't help, neither will dba_datapump_jobs.

I then tried few pieces of code found here and there on the Oracle documentation and other websites to try to get more information about this naughty invalid operation error with no success.
And hopefully, AWS support provided the below code where you just have to customize the highlighted part of it:

set serveroutput on ;
DECLARE
  ind NUMBER;              -- Loop index
  spos NUMBER;             -- String starting position
  slen NUMBER;             -- String length for output
  h1 NUMBER;               -- Data Pump job handle
  percent_done NUMBER;     -- Percentage of job complete
  job_state VARCHAR2(30);  -- To keep track of job state
  le ku$_LogEntry;         -- For WIP and error messages
  js ku$_JobStatus;        -- The job status from get_status
  jd ku$_JobDesc;          -- The job description from get_status
  sts ku$_Status;          -- The status object returned by get_status
BEGIN
  h1 := DBMS_DATAPUMP.OPEN(operation => 'IMPORT', job_mode => 'SCHEMA', job_name=>null);
  DBMS_DATAPUMP.ADD_FILE(handle => h1, filename =>'DUMPFILE_NAME', directory =>'DIRECTORY_NAME', filetype => dbms_datapump.ku$_file_type_dump_file);
  DBMS_DATAPUMP.ADD_FILE(handle => h1, filename => 'LOGFILE_NAME', directory => 'DIRECTORY_NAME', filetype => dbms_datapump.ku$_file_type_log_file);
  DBMS_DATAPUMP.SET_PARAMETER(h1,'TABLE_EXISTS_ACTION','REPLACE');
    begin
    dbms_datapump.start_job(h1);
    dbms_output.put_line('Data Pump job started successfully');
    exception
      when others then
        if sqlcode = dbms_datapump.success_with_info_num
        then
          dbms_output.put_line('Data Pump job started with info available:');
          dbms_datapump.get_status(h1,
                                   dbms_datapump.ku$_status_job_error,0,
                                   job_state,sts);
          if (bitand(sts.mask,dbms_datapump.ku$_status_job_error) != 0)
          then
            le := sts.error;
            if le is not null
            then
              ind := le.FIRST;
              while ind is not null loop
                dbms_output.put_line(le(ind).LogText);
                ind := le.NEXT(ind);
              end loop;
            end if;
          end if;
        else
          raise;
        end if;
  end;
 percent_done := 0;
  job_state := 'UNDEFINED';
  while (job_state != 'COMPLETED') and (job_state != 'STOPPED') loop
    dbms_datapump.get_status(h1,
           dbms_datapump.ku$_status_job_error +
           dbms_datapump.ku$_status_job_status +
           dbms_datapump.ku$_status_wip,-1,job_state,sts);
    js := sts.job_status;
-- If the percentage done changed, display the new value.
     if js.percent_done != percent_done
    then
      dbms_output.put_line('*** Job percent done = ' ||
                           to_char(js.percent_done));
      percent_done := js.percent_done;
    end if;
-- Display any work-in-progress (WIP) or error messages that were received for
-- the job.
      if (bitand(sts.mask,dbms_datapump.ku$_status_wip) != 0)
    then
      le := sts.wip;
    else
      if (bitand(sts.mask,dbms_datapump.ku$_status_job_error) != 0)
      then
        le := sts.error;
      else
        le := null;
      end if;
    end if;
    if le is not null
    then
      ind := le.FIRST;
      while ind is not null loop
        dbms_output.put_line(le(ind).LogText);
        ind := le.NEXT(ind);
      end loop;
    end if;
  end loop;
  dbms_output.put_line('Job has completed');
  dbms_output.put_line('Final job state = ' || job_state);
  dbms_datapump.detach(h1);
  exception
    when others then
      dbms_output.put_line('Exception in Data Pump job');
      dbms_datapump.get_status(h1,dbms_datapump.ku$_status_job_error,0,
                               job_state,sts);
      if (bitand(sts.mask,dbms_datapump.ku$_status_job_error) != 0)
      then
        le := sts.error;
        if le is not null
        then
          ind := le.FIRST;
          while ind is not null loop
            spos := 1;
            slen := length(le(ind).LogText);
            if slen > 255
            then
              slen := 255;
            end if;
            while slen > 0 loop
              dbms_output.put_line(substr(le(ind).LogText,spos,slen));
              spos := spos + 255;
              slen := length(le(ind).LogText) + 1 - spos;
            end loop;
            ind := le.NEXT(ind);
          end loop;
        end if;
      end if;
END;
/

And the magic happens when you execute this code:

Exception in Data Pump job
ORA-39002: invalid operation
ORA-31694: master table "xxxxxxxx"."SYS_IMPORT_SCHEMA_06" failed to
load/unload
ORA-31644: unable to position to block number 3667498 in dump file
"/rdsdbdata/userdirs/01/a_dumpfile.dmp"
PL/SQL procedure successfully completed.

as it gives you the error code which allowed me to find out that the issue was a corrupted dumpfile ! This is indeed a 100+ more lines PL/SQL script but you can troubleshoot your Oracle datapump import to AWS RDS very easily which is the main thing !

I hope this will help you as much as it helps me ! as a reminder, the code is also on my GitHub repository for a better copy and paste experience.

4 comments:

AnonymousOctober 7, 2019 at 10:16 PM
Hi there,

nice article,
don't forget the S3_INTEGRATION option and role for your RDS instance,
otherwise you can not access the package rdsadmin_s3_tasks as it is not created.

And you can check the log file from import with :

select * from table(RDSADMIN.RDS_FILE_UTIL.read_text_file(p_directory=>'DATA_PUMP_DIR',p_filename=>''));

That is useful when you have to much errors.
hongMay 19, 2021 at 11:52 PM
How do we check the completion of the download automatically?