Parsing a CSV file in PL/SQL
The ability to parse a CSV file in PL/SQL seems like a simple requirement and one would think that you could either a) easily implement it yourself, or b) find some examples of it on the web. Well if you have tried option A, you probably realized it gets real tricky when you have commas and double quotes in your actual data as well as having them as your deliminators and optionally enclosed by characters as well. Plus all that substr‘ing and instr‘ing can really hurt your head after a while. If you tried option B, then you probably discovered that there are some solutions out there, but they all seems to either incomplete, or just overly complex.
So I decided to write my own simple, yet complete CSV parser in PL/SQL. It handles all data, both optionally enclosed by some character or not, as well as both DOS (CR+LF) and UNIX (LF only) end-of-line file formats. And all this in less than 100 lines of code (with comments) and with only three distinct calls to substr() and NO calls to instr().
I wanted to share this in hopes that others find it useful.
create or replace procedure parse_csv(
p_clob clob,
p_delim varchar2 default ',',
p_optionally_enclosed varchar2 default '"' ) is
--
CARRIAGE_RETURN constant char(1) := chr(13);
LINE_FEED constant char(1) := chr(10);
--
l_char char(1);
l_lookahead char(1);
l_pos number := 0;
l_token varchar2(32767) := null;
l_token_complete boolean := false;
l_line_complete boolean := false;
l_new_token boolean := true;
l_enclosed boolean := false;
--
l_lineno number := 1;
l_columnno number := 1;
begin
loop
-- increment position index
l_pos := l_pos + 1;
-- get next character from clob
l_char := dbms_lob.substr( p_clob, 1, l_pos);
-- exit when no more characters to process
exit when l_char is null or l_pos > dbms_lob.getLength( p_clob );
-- if first character of new token is optionally enclosed character
-- note that and skip it and get next character
if l_new_token and l_char = p_optionally_enclosed then
l_enclosed := true;
l_pos := l_pos + 1;
l_char := dbms_lob.substr( p_clob, 1, l_pos);
end if;
l_new_token := false;
-- get look ahead character
l_lookahead := dbms_lob.substr( p_clob, 1, l_pos+1 );
-- inspect character (and lookahead) to determine what to do
if l_char = p_optionally_enclosed and l_enclosed then
if l_lookahead = p_optionally_enclosed then
l_pos := l_pos + 1;
l_token := l_token || l_lookahead;
elsif l_lookahead = p_delim then
l_pos := l_pos + 1;
l_token_complete := true;
end if;
elsif l_char in ( CARRIAGE_RETURN, LINE_FEED ) and NOT l_enclosed then
l_token_complete := true;
l_line_complete := true;
if l_lookahead in ( CARRIAGE_RETURN, LINE_FEED ) then
l_pos := l_pos + 1;
end if;
elsif l_char = p_delim and not l_enclosed then
l_token_complete := true;
elsif l_pos = dbms_lob.getLength( p_clob ) then
l_token := l_token || l_char;
l_token_complete := true;
l_line_complete := true;
else
l_token := l_token || l_char;
end if;
-- process a new token
if l_token_complete then
dbms_output.put_line( 'R' || l_lineno || 'C' || l_columnno || ': ' ||
nvl(l_token,'**null**') );
l_columnno := l_columnno + 1;
l_token := null;
l_enclosed := false;
l_new_token := true;
l_token_complete := false;
end if;
-- process end-of-line here
if l_line_complete then
dbms_output.put_line( '-----' );
l_lineno := l_lineno + 1;
l_columnno := 1;
l_line_complete := false;
end if;
end loop;
end parse_csv;
/
And here is a little test procedure to show it working. I have made the end-of-line different for each like to demonstrate this will work with all EOL terminators. In real-life (I hope) your CSV file will have just one.
declare
l_clob clob :=
-- DOS EOL
'A,B,C,D,E,F,G,H,I' || chr(13) || chr(10) ||
-- Apple up to OS9 EOL
'1,"2,3","1""2","""4,",",5"' || chr(13) ||
-- Acorn BBD and RISC OS EOL
'6,"this is a ""test",""",8","9"",","10,"""' || chr(10) || chr(13) ||
-- Unix and OS X EOL
'normal,"commas,,,in the field","""enclosed""","random "" double "" quotes","commas,,, "" and double """" quotes"' || chr(10) ||
-- Line with EOF only
'",F""",,,,abcde';
begin
parse_csv( l_clob );
end;
/
And when I run it I get…
R1C1: A
R1C2: B
R1C3: C
R1C4: D
R1C5: E
R1C6: F
R1C7: G
R1C8: H
R1C9: I
-----
R2C1: 1
R2C2: 2,3
R2C3: 1"2
R2C4: "4,
R2C5: ,5
-----
R3C1: 6
R3C2: this is a "test
R3C3: ",8
R3C4: 9",
R3C5: 10,"
-----
R4C1: normal
R4C2: commas,,,in the field
R4C3: "enclosed"
R4C4: random " double " quotes
R4C5: commas,,, " and double "" quotes
-----
R5C1: ,F"
R5C2: **null**
R5C3: **null**
R5C4: **null**
R5C5: abcde
-----
I think I have covered all the bases and possibilities for parsing a CSV file. You can easily modify the code to store the tokens as rows in a table or push them into an Apex collection for further processing later. I just used dbms_output.put_line() to show it working.
Give it a try and let me know if you find a case that this code does not handle.
Enjoy.
My First APEX Plug-In – Color Picker.
For anyone who does not know by now, APEX 4.0 has been released. And if you have not yet downloaded and upgraded, I say, “What are you waiting for?!?!?!”. Click here NOW and download it. I’ll wait. *insert jeopardy music here*. Ok, done? Great!
There are a ton of new features in this release. One of the most significant ones are Plug-Ins. In a nut shell, Plug-Ins give you, the APEX developer, to ability to extend the APEX development framework with custom item types, region types, and process and dynamic actions.
I finally got around to playing with them and lets just say I am impressed. At first, the idea of writing Plug-Ins seems daunting, but after playing with them for an hour or so, it did not seem so hard. It’s easy to take the example supplied with the sample application and and use it as a outline to make your own. In fact, that is exactly what I did to make my first item Plug-In. I encourage you to do the same. You can also grab my Plug-In and see what I did as well.
In prior APEX releases, if you wanted to include a jQuery plugin on your page, you would have had to embed all the code into the page itself. And there are many different places to include that code. Each developer does it a bit differently. I know myself, I was not very consistent going from page to page within the same application. This made for maintenance nightmares. Plug-Ins fix all of that.
The Plug-In I built was a color picker. I know APEX already has one, but let’s face it, it’s so Web 1.0. I felt it needed a face lift. But I kinda cheated. I did not write one from scratch. Well to be completely honest, I did not write any of the JavaScript code for this color picker. I downloaded a jQuery color picker called Farbtastic written by Steven Wittens and built an APEX Plug-In with the apex_plugin, apex_javascript, and apex_css APIs. And that is what is so powerful about this Plug-In framework. Among other things, it allows you to leverage code and jQuery plugins in APEX natively. So, now I can use my Plug-In on any page of my application as easily as I could use any other item type in APEX. It’s listed right there in the item type select list with textbox, date picker, shuttle, … and the rest of the default item types.
You can check out my color picker Plug-In and download and install it yourself.
Yet Another APEX Blogger.
I would just like to welcome a former colleague and good friend, Brian Spendolini, to the APEX blogging community and I look to interesting and informative posts for him in the weeks and months to come.
Enhanced APEX Shuttle Item
Recently I have been really, really busy building some internal applications for my company. All of them using Oracle and APEX of course. To satisfy some of application’s requirements, and just for usability sake, I had to tweek the default behavior of the APEX shuttle item. The reason was the initial LOV that populated the shuttle contained over 500 values. About 490+ too many as far as I was concerned. The users felt the same way. They complained that it was too hard to find what they were looking for, even though the list was in alphabetical order and I agreed. So, I decided to add the functionality to allow the user to constrain the values in left-hand side of the shuttle. Similar to the Popup LOVs with a filter option, but I wanted to use AJAX to accomplish it to avoid a page submit. After a bunch of playing around and optimization, here is what I came up with:
http://apex.oracle.com/pls/otn/f?p=19903:10
Username/Password: demo/demo
I was surprised that it was not that hard to accomplish and there are very few moving parts. All you need is a javascript function to make the AJAX call, an On Demand process to fetch only those values that meet the filter’s constraint and a way to trigger it.
How It Works.
As the user types in the textbox the shuttle automagically filters itself. I use an onKeyUp event on the textbox P10_SEARCH_BY to trigger the AJAX call.
onKeyUp="populateShuttle( this,'P10_SHUTTLE');"
So everytime a KeyUp event happens in that textbox, a call to the populateShuttle function happens. That function is responsible for making the AJAX call to the On-Demand Process, which fetches the appropriate values, and then parsing the results and repopulating the left hand side of the shuttle.
function populateShuttle(filter,shuttleName)
{
var right = $x(shuttleName+"_RIGHT");
for ( var i=0;i<right .length; i++ )
right.options[i].selected = true;
var req = new htmldb_Get( null, 19903,
'APPLICATION_PROCESS=populateShuttle', 0 );
req.addParam( 'x01', $v(right) );
req.addParam( 'x02', $v(filter) );
var resp = eval(req.get());
var left = $x(shuttleName+"_LEFT");
left.length=0;
if (resp)
for ( var i=0; i<resp.length; i++ )
left.options[i] = new Option(resp[i].data, resp[i].id);
req = null;
}
Although APEX treats the shuttle as a single element, it is in fact made up of many HTML elements, 10 images and 2 multiselect list boxes. With a little poking around the HTML of a generated APEX page with a shuttle item on it, I was able to determine that the actual DOM names of the multiselect list boxes of a shuttle were [SHUTTLE_NAME]_LEFT and [SHUTTLE_NAME]_RIGHT. How convenient. Accessing the actual elements was then trivial. I just used the APEX supplied function $x() function to get the elements.
The first thing I do is get the right textbox and loop over its values, selecting each one. This is so when I fetch its value, I get all the elements in the list. I need this information so when I fetch the values for the left hand side of the shuttle, based on the user’s filter, I do NOT bring back any values that have already been moved to the right hand side of the shuttle. In my example say you had already selected KING and moved him to the right. Then you type a ‘K’ in the filter_by textbox, you will only get back CLARK and BLAKE even though KING also meets the constraint of containing a ‘K’.
Next, I get the value of the right hand side of the shuttle and the value of the filter_by textbox, using the $v() function and send the values to the On Demand process via the APEX global variables g_x01 and g_x02.
The On Demand process is just a simple PL/SQL block that is a query loop.
declare
l_selected long := wwv_flow.g_x01;
l_search long := wwv_flow.g_x02;
begin
htp.prn('[');
for c in (
select ename, empno, rownum r
from emp
where regexp_like( ename, nvl(l_search,ename), 'i' )
and nvl(instr(':'||l_selected||':',':'||empno||':'),0)=0 )
loop
htp.prn( case when c.r>1 then',' else null end ||
'{ id: ' || c.empno ||
', data: "' || c.ename || '"}' );
end loop;
htp.prn(']');
end;
It fetches all the values that match the filter
where regexp_like( ename, nvl(l_search,ename), 'i' )
but not those already be selected by the user.
and nvl(instr(':'||l_selected||':',':'||empno||':'),0)=0)
Now it’s just a simple process of looping over the rows, packaging them up and sending them back. I have become a big fan of JSON ( JavaScript Object Notation http://www.json.org/ ) , so that how I am packaging up the payload to pass back. Basically, I’m making the results into a javascript array of objects. Each object has an id and data element. The payload look something like this
[{id: 7839, data: "KING"}{id: 7698, data: "BLAKE"} ... ]
This makes it super simple easy to work with once it gets back to the calling javascript function. All you need to do is apply an eval() on it and its transformed into a true javascript array of objects that can be easily looped over and referenced.
In one line of code, I call the On Demand process, getting back the JSON packed result and eval() it.
var resp = eval(req.get());
Now the variable resp is an Array of objects with two elements in each object, id and data. Looping over the values and populating the left hand side of the shuttle is trivial now.
if (resp)
for ( var i=0; i<resp.length; i++ )
left.options[i] = new Option(resp[i].data, resp[i].id);
And there you have it. I hope I have shown that with some minor tweeks, you can extend the functionality of the base APEX items to be much more functional and user friendly. As always, comments ( good and bad ) are always welcome. Read more…
Pipelined Functions
Well, it’s been a quite while since I posted (I know, I’m slacking) so I thought I’d just quickly blog about what I just helped someone with yesterday. It required me to use a pipelined function. Not everyone has seen of them so I thought it may be interesting.
Pipelined functions allow you to basically select from a function rather than a database table.
A colleague of mine had to build an APEX report of opportunities. The report was to be constrained by a multiselect listbox of product ids. Show all opportunities that have ANY of the selected product ids. So far it sound pretty easy. The tricky part was how the product ids were stored in the opportunity table.
create table opportunity( name varchar2(100), product_ids varchar2(255) );
The PRODUCT_IDS column was populated with a colon delimited list of all the products that were used in that opportunity. Now before you yell that that is probably not the best way to store that data and that a master-detail table relationship is probably the better way to go, I 100% agree. But unfortunately, when you inherit an application, you many times must make do with what the original developer put in place (and that is the case here).
We all know when you submit a multiselect listbox in APEX, it comes in as a colon delimited list of all the values that were selected. What we need to do is tokenize that selectlist string and compare each value against the table.
I came up with two possible solutions:
1. Build the query by hand.
Using apex_util.string_to_table() on the input, loop over all the values and manually build the report query by hand.
declare
l_ids apex_application_global.vc_arr2;
l_query varchar2(32767) := null;
begin
l_ids := apex_util.string_to_table( :p9_product_ids );
for i in 1 .. l_ids.count loop
l_query := l_query || <insert query here>;
if i <> l_ids.count then
l_query := l_query || ' union all ';
end if';
end loop;
return l_query;
end;
This would have worked just fine (besides having to run the query N times and union all them together).
But I like to use nifty, (and more optimized) options and so I implemented my other solution:
2. Use a pipelined function.
To use this, we need to create a type and a function that returns that type.
create or replace type myTableType as table of varchar2(255) /
create or replace
function to_myTableType( p_string varchar2 )
return myTableType PIPELINED is
l_arr apex_application_global.vc_arr2;
begin
l_arr := apex_util.string_to_table( p_string );
for i in 1 .. l_arr.count loop
PIPE ROW( ':' || l_arr(i) || ':' );
end loop;
return;
end;
/
Notice that the function is declared as PIPELINED in its declaration. Also notice the PIPE ROW() syntax in the loop. That is where the function returns multiple values as rows in a query. Its using the string_to_table() function to tokenize the string and then looping over the tokens, pipes the results out as rows. And finally, this function contains a return clause without returning anything. It had already returned multiple values in the PIPE ROW() line.
Now that we have this we can write a single, simple query in APEX that looks like this:
select o.name, o.product_ids
from opportunity o,
table( to_myTableType( :p1_product_ids )) p
where instr( ':' || o.product_ids || ':', p.COLUMN_VALUE ) > 0;
Check out a working version here: http://apex.oracle.com/pls/otn/f?p=19903:9
WebLogic Server and APEX
A good friend and colleague of mine at Oracle, Mark Greynolds, shared with me his work with WebLogic and APEX and getting them to work together. I thought it was something that everyone would be interested in and asked him if he would be willing to document the steps and allow me to post them here on my blog. He was and so here it is.
WebLogic Server and APEX
When a WebLogic Server (WLS) is the primary Web server, accessing APEX pages though the WLS requires a proxy. The configuration of APEX generally follows one of two configurations – Apache with mod_plsql or the Embedded PL/SQL gateway. When WebLogic (without Apache) is the main HTTP server, getting APEX to surface on the same port as WebLogic requires some form of proxy.
Overall Approach
This solution creates a very simple Web Application that invokes a Java Proxy Servlet when a user tries to access APEX pages. Wrapping this Web Application around the Java Proxy Servlet lets the WLS serve APEX without any port conflicts. The WLS Proxy Servlet is a fully documented out of the box tool. To create and deploy the Web Application simply build the files outlined in this document, deploy the application and then access APEX.
Exploded deployment
For convenience, this solution takes advantage of the exploded deployment feature of the WSL. In addition to the ability to deploy jar, war and ear files, the WLS can deploy Web Applications as an exploded directory that contains the same contents of the archive file. An exploded archive directory contains the same files and directories as a jar archive. However, the files and directories reside directly in the file system instead of as a single archive file (JAR). This example uses the exploded deployment style to create the Web Application for this example.
Default WebLogic application
The default Web Application is what the server sends to browser clients who do not specify a recognized URI (or specify only “/” as the URI). Each server has only one default Web Application and for this solution to work, this application must be set as the default. If there is already a default, this servlet could be added to the existing application by using an exploded deployment of the default with modification to the web.xml to register the APEX proxy.
Pre-requisites
1. An Oracle database successfully serving APEX pages. The APEX instance may be on the same or different machine and served from either Apache or the Embedded PL/SQL gateway. In this example, APEX uses the Embedded PL/SQL Gateway of a database running on the same machine as the WebLogic server and natively appears at the http://localhost:8080/apex URL.
2. An Oracle WebLogic Server 10.3 running a Node Manger, the Administration Console and a Managed server. This example uses a domain created specifically for this exercise named APEXDemo. The WLS Administration console uses port 7001 and the Managed Server uses port 80.
3. There is no other “default” WebLogic application for the Managed Server.
Create the APEX Proxy Servlet
1. Create the following directory structure somewhere on disk. This example assumes the C:\ drive is used. Note: the apexproxy.war directory name mimics the normal J2EE naming convention for Web Application archive (WAR).
2. In the apexproxy.war directory, create the index.html file. The WLS Managed Server renders this page when the server cannot map the browser’s URL to a valid destination. For this example, APEX becomes the default page due to a simple redirect to the full APEX path.
<html>
<head>
<meta http-equiv="refresh" content="0;url=apex">
</head>
<body>
</body>
</html>
3. In the WEB-INF directory, create the web.xml file that defines this simple Web Application.
<?xml version="1.0" encoding="UTF-8"?>
<web-app>
<display-name>APEX Proxy</display-name>
<servlet>
<servlet-name>ProxyServlet</servlet-name>
<servlet-class>
weblogic.servlet.proxy.HttpProxyServlet
</servlet-class>
<init-param>
<param-name>redirectURL</param-name>
<param-value>http://localhost:8080</param-value>
</init-param>
</servlet>
<servlet-mapping>
<servlet-name>ProxyServlet</servlet-name>
<url-pattern>/apex/*</url-pattern>
</servlet-mapping>
<servlet-mapping>
<servlet-name>ProxyServlet</servlet-name>
<url-pattern>/i/*</url-pattern>
</servlet-mapping>
<welcome-file-list>
<welcome-file>index.html</welcome-file>
</welcome-file-list>
</web-app>
The <servlet-class> tag identifies the proxy servlet class and provides the
redirection URL to use when the container invokes the servlet. The two
<servlet-mapping> tags explicitly identify the two URL patterns used by APEX.
The WLS documentation suggests using a single mapping of “/” but this causes every
unmatched request to forward to the Oracle XML DB default splash page and ignores the
index.html file.
4. In the WEB-INF directory, also create the following weblogic.xml file. At deployment, WLS scans this file for the information on how to configure the deployment.
<?xml version='1.0' encoding='UTF-8'?>
<weblogic-web-app
xmlns="http://www.bea.com/ns/weblogic/weblogic-web-app"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.bea.com/ns/weblogic/weblogic-web-app
http://www.bea.com/ns/weblogic/weblogic-web-app/1.0/weblogic-web-app.xsd">
<session-descriptor></session-descriptor>
<jsp-descriptor></jsp-descriptor>
<container-descriptor></container-descriptor>
<context-root>/</context-root>
<servlet-descriptor>
<servlet-name>ProxyServlet</servlet-name>
</servlet-descriptor>
</weblogic-web-app>
The <context-root> tag is the key for making this example function. This tag
makes this the default application for the Managed Server.
Deploy the Proxy Servlet
5. Begin the deployment by accessing the WLS Administration Console using the http://localhost:7001/console URL. For this example, the Administration Console’s Username and Password are weblogic and weblogic.
6. On the left side navigation panel, click the Deployments link.
7. Click the Install button.
8. Use the Path: field to locate the C:\APEXProxy directory. NOTE: this page can
navigate directories on the drive by entering C:\ and clicking the Next button. The server
returns an error and shows the directories in a navigable list.
Click the radio button next to the apexproxy.war directory name and then click the
Next button. NOTE: do not type in the complete path to the war directory – clicking the
radio button for the WAR automatically completes the Path specification.
9. Click the Next button to accept installing the deployment as an application.
10. Select the desired server and click the Next button.
11. Click the Finish button to accept default values for all the remaining deployment values
and finish the deployment.
12. The Administration Console displays the configuration page for the newly deployed
application with the following messages.
Lower down the page, the server display the status of the new application.
13. Enter the Managed Server URL in to a browser to see the APEX login page.
The browser receives a redirect from the index.html page and displays the APEX login
screen.










